[plug] preventing data "theft"

Denis Brown dsbrown at cyllene.uwa.edu.au
Mon Dec 11 15:43:27 WST 2006


Dear PLUG list members,

I have a couple of thoughts on this one but would like to tap the 
collective knowledge base a.k.a. be hit with the clue bat :-)

Linux file server holds data for analysis.
The statistics programmes to be used reside on the server.
The principal researcher has full access to all files.
Server runs ssh, vncserver and precious little else in the way of 
communications pathways!
Firewalled, patched, limited access by IP range, blah, blah, blah.

BUT...

Principal researcher also wants to make data available for analysis by 
others who have accounts on the server but without any possibility of the 
data leaving the server - assistants should not be able to copy the 
data.   Some of those assistants are physically off site, so there is no 
means of verifying their compliance.

One thought I had was to set file permissions on the data area to prevent 
access by assistants - but they need to be able to drill down to 
directories and pass filenames into the stats package(s) for 
analysis.   Ergo at that level at least they will have access and could 
conceivably copy files.   I cannot withhold read permission, because the 
data DOES need to be read by the statistics suites.
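
To make the dilemma concrete, here is a minimal Python sketch of why read 
access amounts to copy access. It is only an illustration under stated 
assumptions: the file name survey.dat, its contents and the 0440 mode are 
made up, and everything happens in a throwaway temporary directory rather 
than on the real data area.

```python
import os
import tempfile


def copy_via_read(src: str, dst: str) -> int:
    """Duplicate src into dst using only read access to src.

    Returns the number of bytes copied.
    """
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(65536)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied


def demo() -> bool:
    """Create a read-only file, copy it, and report whether the copy matches."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical data file standing in for the research data.
        data = os.path.join(workdir, "survey.dat")
        with open(data, "wb") as fh:
            fh.write(b"id,score\n1,42\n")
        # Read-only for owner and group, no access for others.
        os.chmod(data, 0o440)

        duplicate = os.path.join(workdir, "copy.dat")
        copy_via_read(data, duplicate)

        with open(data, "rb") as a, open(duplicate, "rb") as b:
            return a.read() == b.read()


if __name__ == "__main__":
    print("copy identical to original:", demo())
```

Anything the statistics suites can open on an assistant's behalf, the 
assistant's own processes can open and duplicate in the same way, so 
permission bits alone do not separate "analyse" from "copy".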

Another thought was to offer packaged or scripted analyses but this somewhat 
defeats the purpose of having assistants who can respond to intermediate 
results, fine tune their analyses, etc.

I could assign the assistants' accounts a /dev/null home directory which 
prevents their staging files there for subsequent copying but that could 
prove problematic if the statistics programmes need to write errors, logs, 
etc into a valid directory.
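
A quick way to test that worry is to probe whether a candidate home 
directory accepts writes. The Python sketch below is an assumption-laden 
illustration, not a description of how any particular statistics package 
behaves: the helper name can_write_in and the probe file name are 
hypothetical. On Linux, /dev/null is a character device rather than a 
directory, so creating a file beneath it is expected to fail.

```python
import os
import tempfile


def can_write_in(home: str) -> bool:
    """Return True if a small file can be created inside `home`."""
    # Hypothetical probe file name, removed again on success.
    probe = os.path.join(home, ".write_probe")
    try:
        with open(probe, "w") as fh:
            fh.write("test\n")
    except OSError:
        # Covers "not a directory", "permission denied", "no such file", etc.
        return False
    os.remove(probe)
    return True


if __name__ == "__main__":
    print("/dev/null writable as a home:", can_write_in("/dev/null"))
    with tempfile.TemporaryDirectory() as scratch:
        print("scratch directory writable:", can_write_in(scratch))
```

A program that insists on writing logs or error files under the home 
directory would hit the same failure, which is the problem anticipated 
above.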

The principal researcher is worried that just getting the assistants to 
sign off on a legal document might not prevent the leakage of data, however 
innocently.   The data is not "sensitive" in that it is de-identified but 
falling into the Wrong Hands (tm) might lead to out-of-context analyses, 
publications, etc.

I may of course be missing the entirely obvious :-(

TIA for your creativity,
Denis



