<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Logging and Privacy concerns &#8212; v2.1.0-beta</title>
    
    <link rel="stylesheet" href="../_static/dataone.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '2.1.0-beta',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true,
        SOURCELINK_SUFFIX: '.txt'
      };
    </script>
    <script type="text/javascript" src="../_static/mathjax_pre.js"></script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <script type="text/javascript" src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML"></script>
    <script type="text/javascript" src="../_static/sidebar.js"></script>
    <link rel="author" title="About these documents" href="../about.html" />
    <link rel="index" title="Index" href="../genindex.html" />
    <link rel="search" title="Search" href="../search.html" />
    <link rel="next" title="Cross Domain Indexing and Access for Data and Metadata" href="DataAndMetadata.html" />
    <link rel="prev" title="General Design and Implementation Notes" href="index.html" />
   
  
  <link media="only screen and (max-device-width: 480px)" href="../_static/small_dataone.css" type= "text/css" rel="stylesheet" />

  </head>
  <body role="document">
  
    <div class="version_notice">
      <p>
      <span class='bold'>Warning:</span> These documents are under active 
      development and subject to change (version 2.1.0-beta).<br />
      The latest release documents are at:
      <a href="https://purl.dataone.org/architecture">https://purl.dataone.org/architecture</a>
      </p>
    </div>

    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="DataAndMetadata.html" title="Cross Domain Indexing and Access for Data and Metadata"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="index.html" title="General Design and Implementation Notes"
             accesskey="P">previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="../index.html"></a> &#187;</li>
          <li class="nav-item nav-item-1"><a href="index.html" accesskey="U">General Design and Implementation Notes</a> &#187;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="logging-and-privacy-concerns">
<h1>Logging and Privacy concerns<a class="headerlink" href="#logging-and-privacy-concerns" title="Permalink to this headline">¶</a></h1>
<p>Design decisions for DataONE have until now focused on comprehensive and
universal logging of all operations performed on Member Nodes (MNs) and Coordinating Nodes (CNs).
One rationale is that data providers have traditionally been unwilling to
replicate their data for distribution by other parties because they have been unable
to get usage metrics for these data.  The current DataONE design for logging is based
on five use cases that outline the need to provide log information to data
providers (see <a class="reference internal" href="../design/logging.html#logging-use-case-synopsis"><span class="std std-ref">Use Cases to be Supported</span></a> for a summary of Use Cases 16, 17, 18,
19, and 20). Under the current <a class="reference internal" href="../design/LoggingSchema.html"><span class="doc">Logging Schema</span></a>, all operations are logged,
recording the user&#8217;s IP address, browser agent, the date, time, and type of the
operation, and the user&#8217;s identity if they have authenticated to the system.</p>
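<p>As a rough illustration of the fields listed above, a single log record could be
modeled as follows. This is a minimal sketch only: the class name and field names are
hypothetical and are not taken from the Logging Schema.</p>
<div class="highlight-python"><div class="highlight"><pre>from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical log record; field names are illustrative only."""
    ip_address: str            # address of the requesting client
    user_agent: str            # browser agent string sent by the client
    logged_at: datetime        # date and time of the operation
    event: str                 # type of operation, e.g. "read"
    subject: Optional[str]     # user identity, or None if not authenticated


record = LogRecord(
    ip_address="192.0.2.10",   # documentation-only address (RFC 5737)
    user_agent="ExampleBrowser/1.0",
    logged_at=datetime(2017, 1, 1, tzinfo=timezone.utc),
    event="read",
    subject=None,              # the user did not authenticate
)
</pre></div></div>
<p>Note that even with no identity recorded, the record in this sketch still carries the
client IP address.</p>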
<div class="section" id="privacy-concerns">
<h2>Privacy concerns<a class="headerlink" href="#privacy-concerns" title="Permalink to this headline">¶</a></h2>
<p>Recent discussions have pointed out potential privacy concerns for
data users associated with these logging policies, and suggested that DataONE should consider
cases where truly anonymous access to resources may be warranted.  A comparison has
been made to libraries, where patron access to resources is not recorded in order to
avoid having to expose these records to third parties. A similar situation may exist
where a data user does not want a data provider or other third parties to know that
they accessed data in DataONE.  Some example scenarios might include:</p>
<ul class="simple">
<li>A scientist wants to analyze climate change data without the data set being traceable
by regulatory bodies until they publish.</li>
<li>A scientist wants to analyze a data set without that access being visible to
colleagues.</li>
</ul>
<p>There may be more compelling scenarios than these for privacy concerns.</p>
</div>
<div class="section" id="potential-designs">
<h2>Potential designs<a class="headerlink" href="#potential-designs" title="Permalink to this headline">¶</a></h2>
<ul>
<li><dl class="first docutils">
<dt>All events logged and users identified</dt>
<dd><ul class="first last simple">
<li>All MNs must implement logging, must provide user
identity in those logs if the user has been authenticated, and must provide
those logs to the CN log aggregation service.</li>
</ul>
</dd>
</dl>
</li>
<li><dl class="first docutils">
<dt>Data Providers can require user identity</dt>
<dd><ul class="first last simple">
<li>Currently, DataONE access control directives (see
<a class="reference internal" href="../design/Authorization.html"><span class="doc">Authorization in DataONE</span></a>) would allow a data provider to specify
that objects are accessible only to an &#8216;AuthenticatedUser&#8217;, which means that the user&#8217;s
username, other identifying information, and IP address are available.
There is currently no specification of what this identifying information
would be, but a reasonable minimal set would be name and email address.</li>
</ul>
</dd>
</dl>
</li>
<li><dl class="first docutils">
<dt>Data Consumers can request anonymity</dt>
<dd><ul class="first last simple">
<li>Under this scenario, data consumers would not authenticate against DataONE, and
thus their identifying information would not be logged at the MN or CN.  However,
under the current specification, their IP address would still be recorded, which
may be sufficient to identify the user.  The specification could be modified to
eliminate the collection of IP addresses for non-authenticated users, but this
would significantly compromise our ability to analyze anonymous download
statistics (e.g., geographic breakdown, differentiating web-crawler accesses
from user accesses, etc.). An alternative would be to create a mechanism to
differentiate typical non-authenticated access (where IP addresses are recorded)
from &#8216;anonymous&#8217; access (where IP addresses are not recorded).</li>
</ul>
</dd>
</dl>
</li>
<li><dl class="first docutils">
<dt>Both require identity and request anonymity</dt>
<dd><ul class="first last simple">
<li>A combination of the last two scenarios, where data providers can demand
identity through authentication, and consumers can insist upon anonymity.  In
this case, any data objects that would otherwise be available to the user but
require identity logging would be unavailable to anonymous users.</li>
</ul>
</dd>
</dl>
</li>
</ul>
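<p>The last two designs above can be sketched in a few lines. This is an illustration
under stated assumptions, not part of any DataONE specification: the names
<code>AccessMode</code>, <code>build_log_entry</code>, and <code>may_access</code> are
hypothetical, and the sketch assumes that the event itself is still logged for anonymous
access while the IP address and identity are left out.</p>
<div class="highlight-python"><div class="highlight"><pre>from enum import Enum
from typing import Optional


class AccessMode(Enum):
    """Hypothetical access modes; names are illustrative only."""
    AUTHENTICATED = "authenticated"          # identity and IP address logged
    NON_AUTHENTICATED = "non-authenticated"  # IP address logged, no identity
    ANONYMOUS = "anonymous"                  # neither identity nor IP address logged


def build_log_entry(mode: AccessMode, event: str, ip_address: str,
                    subject: Optional[str] = None) -> dict:
    """Return the fields that would be logged for one operation."""
    entry = {"event": event, "mode": mode.value}
    if mode is not AccessMode.ANONYMOUS:
        # Typical non-authenticated access keeps the IP address for statistics.
        entry["ip_address"] = ip_address
    if mode is AccessMode.AUTHENTICATED:
        entry["subject"] = subject
    return entry


def may_access(mode: AccessMode, requires_identity: bool) -> bool:
    """Withhold objects that require identity unless the user authenticated."""
    return mode is AccessMode.AUTHENTICATED or not requires_identity
</pre></div></div>
<p>In this sketch, an anonymous request for an object whose provider requires identity is
refused outright, while the same request for a public object succeeds and leaves a log
entry without an IP address.</p>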
</div>
<div class="section" id="implications-and-issues">
<h2>Implications and Issues<a class="headerlink" href="#implications-and-issues" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li>The addition of truly anonymous access complicates the design and implementation of
the APIs, and it makes implementation of the APIs considerably more burdensome for
MNs. This may reduce the number of participating Member Nodes.</li>
<li>The addition of anonymous access may deter MNs from joining DataONE if they cannot
get usage tracking statistics for their data.  Experience with the KNB has indicated
that one of the main reasons contributors choose to share only metadata and not
data is that they want guaranteed usage reporting for their data.</li>
<li>We need to resolve whether our current concept of &#8216;Public&#8217; access to data (see
<a class="reference internal" href="../design/Authorization.html"><span class="doc">Authorization in DataONE</span></a>), which allows non-authenticated access, also implies that the
IP address of the requesting client not be recorded.</li>
<li>What level of user identification and logging will NSF require from DataONE and other DataNet
partners?  Many data projects are required to report, at some level, the kinds of users they
serve and where those users come from (to the limited extent that this
can be inferred from data such as IP addresses).</li>
</ul>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
        <div class="sphinxsidebarwrapper">
    <p class="logo"><a href="http://dataone.org">
      <img class="logo" src="../_static/dataone_logo.png" alt="Logo"/>
    </a></p>
  <h3><a href="../index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Logging and Privacy concerns</a><ul>
<li><a class="reference internal" href="#privacy-concerns">Privacy concerns</a></li>
<li><a class="reference internal" href="#potential-designs">Potential designs</a></li>
<li><a class="reference internal" href="#implications-and-issues">Implications and Issues</a></li>
</ul>
</li>
</ul>
<h3>Related Topics</h3>
<ul>
  <li><a href="../index.html">Documentation Overview</a><ul>
  <li><a href="index.html">General Design and Implementation Notes</a><ul>
      <li>Previous: <a href="index.html" title="previous chapter">General Design and Implementation Notes</a></li>
      <li>Next: <a href="DataAndMetadata.html" title="next chapter">Cross Domain Indexing and Access for Data and Metadata</a></li>
  </ul></li>
  </ul></li>
</ul>
<div id="searchbox" style="display: none" role="search">
  <h3>Quick search</h3>
    <form class="search" action="../search.html" method="get">
      <div><input type="text" name="q" /></div>
      <div><input type="submit" value="Go" /></div>
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>

    <div class="footer">
      <div id="copyright">
      &copy; Copyright <a href="http://www.dataone.org">2009-2017, DataONE</a>.
        [ <a href="../_sources/notes/LoggingAndPrivacy.txt"
               rel="nofollow">Page Source</a> |
          <a href='https://redmine.dataone.org/projects/d1/repository/changes/documents/Projects/cicore/architecture/api-documentation/source/notes/LoggingAndPrivacy.txt'
            rel="nofollow">Revision History</a> ]&nbsp;&nbsp;
      </div>
      <div id="acknowledgement">
        <p>This material is based upon work supported by the National Science Foundation
          under Grant Numbers <a href="http://www.nsf.gov/awardsearch/showAward?AWD_ID=0830944">0830944</a> and <a href="http://www.nsf.gov/awardsearch/showAward?AWD_ID=1430508">1430508</a>.</p>
        <p>Any opinions, findings, and conclusions or recommendations expressed in this
           material are those of the author(s) and do not necessarily reflect the views
           of the National Science Foundation.</p>
      </div>
    </div>
  </body>
</html>