The RDSF is a facility for the long-term storage of research data
and is available to researchers from all disciplines. Physically it is a set of
disks and servers housed in two separate data centres. It provides a central
location for data associated with research activities throughout the
University. The data may be accessed as a Windows shared drive
(this works for Macs too) or via NFS on Linux.
An alternative name for the RDSF mainly used internally.
Any PI in the University. You need to apply to be a Data Steward (see below). You can then apply for storage for any Research Projects you have and can arrange access for other members of your research groups.
How much do you want? A Data Steward can currently have up to 5TB of storage free of charge. Above 5TB, the University has implemented a charging policy. For technical reasons the minimum allocation is 100GB per project.
Costs are detailed here. You should include costs for storing more than 5TB of research data in grant applications.
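The split between free and charged storage is simple arithmetic. The following is a minimal sketch, assuming whole-terabyte figures totalled across all of a Data Steward's projects; the function name `chargeable_tb` is hypothetical, and since the price per terabyte is not given here only the chargeable volume is worked out.

```shell
#!/bin/sh
# Free allowance per Data Steward in TB (the 5TB figure quoted above).
FREE_TB=5

# chargeable_tb TOTAL_TB
# Hypothetical helper: prints how many whole TB exceed the free allowance.
chargeable_tb() {
    if [ "$1" -gt "$FREE_TB" ]; then
        echo $(( $1 - FREE_TB ))
    else
        echo 0
    fi
}

chargeable_tb 8    # prints 3: the volume to cost into a grant application
chargeable_tb 2    # prints 0: within the free allowance
```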
Yes but...
The RDSF is in a secure location, is shareable and is regularly
backed up to tape.
We limit access to each project to a named set of people so only users authorized by the Data Steward can see the data.
The data is stored on multiple disks with protection against
individual component failure. Of course there are events over
which we have no control, e.g. a power failure. All project data is
therefore stored in two different locations. If we lose one
location, your data continues to be available from the other
(replicated) copy, and this is totally transparent to users. The
system is very scalable and we monitor it. If it slows down
significantly we can add more resource (e.g. another pair of filers).
The system is designed for capacity rather than speed, but is comparable with other network drives. If speed is of the essence, store a 'working copy' on local storage and make regular copies to your project(s).
As all projects are replicated, a 5TB project actually occupies 10TB of disk space; for example, a 1KB file will occupy 2KB.
The RDSF was not designed for performance, but purely to offer bulk second tier storage, so we ask that you do not run applications that directly access the RDSF filesystems. If you would like help or advice on changing your usage patterns on the RDSF or with changing to use local disk systems for active data, do contact us.
Yes, all data stored on the RDSF is backed up to our tape library in the Computer Centre. This does not mean that you should use the RDSF to back up all of the data on your PC. See below...
For research data, this is fine and we would encourage you to do so.
As for system backups, such as those made by Apple Time Machine, Linux rsync or PC backup software, we're afraid not. This use is outside the remit and configuration of the RDSF.
The problem is that copies of system-specific files such as applications, preferences and browser caches can cause issues with our backups. In many cases these are rejected by our software, which extends the time our backups take and can affect our ability to recover files.
Anyone who needs to back up non-research data should make alternative arrangements with Zonal teams.
Problematic files we discover will be removed from our backups. In most cases the files are being rejected anyway.
The most widely used program to do this is called rsync. This is a Unix/Linux utility which is also available for Mac and Windows. It is normally run from the command line, e.g.
rsync -av mydata /mnt/rdsf_project/mydata/
For Windows there is a command-line version, cwrsync, and several GUI versions including grsync (also available for Mac & Linux). For Windows only, there is Microsoft's SyncToy, a graphical program that may be more suitable for those unfamiliar with Linux. There are also other file synchronization tools available.
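As a minimal sketch of the rsync approach, the wrapper below copies a working directory to a project while skipping a couple of system-specific files of the kind described earlier. The function name `sync_to_rdsf`, the `RUN` switch and the excluded file names are illustrative assumptions; `/mnt/rdsf_project` is the mount point used in the command above. With `RUN=echo` the command is only printed, so it can be checked before anything is copied.

```shell
#!/bin/sh
# RUN=echo prints the command instead of running it.
# Set RUN to an empty string (RUN=) to really run rsync.
RUN=echo

# sync_to_rdsf SRC DEST
# -a copies recursively and preserves timestamps and permissions;
# -v lists each file as it is copied.
# The --exclude patterns are examples only; adjust them to your own machine.
sync_to_rdsf() {
    $RUN rsync -av \
        --exclude=.DS_Store \
        --exclude=Thumbs.db \
        "$1" "$2"
}

# The trailing slash on the source copies the contents of mydata,
# rather than creating a second mydata directory inside the destination.
sync_to_rdsf mydata/ /mnt/rdsf_project/mydata/
```

rsync also accepts `--dry-run` (or `-n`), which reports what would be transferred without copying anything; this is worth using the first time you sync to a new project.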
Forever. However, we recycle every 30 days, throwing away all but
the latest copies. So if you lose a file we can get it back as long
as you realize, and let us know, within four weeks. This does not
mean that if a file is older than 30 days we don't have it. As long as a file is still on the system, i.e. it has not been deleted, we will
always have the 'latest' copy, even if it's several years
old. We aim to retain all data for at least 20 years.
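One way to read the retention rule above is as a simple check. This is a simplified sketch of the policy as described, not the backup system's actual logic; the function name `restorable` and the treatment of exactly 30 days as still recoverable are assumptions.

```shell
#!/bin/sh
# Recycling period in days, as stated above.
RETENTION_DAYS=30

# restorable STATE [DAYS_SINCE_DELETION]
# STATE is "present" (file still on the RDSF) or "deleted".
# Succeeds (exit status 0) if a backup copy should still exist.
restorable() {
    case $1 in
        present) return 0 ;;                       # latest copy is always kept
        deleted) [ "$2" -le "$RETENTION_DAYS" ] ;; # assumed inclusive boundary
        *)       return 2 ;;                       # unknown state
    esac
}

restorable present    && echo "old but undeleted file: restorable"
restorable deleted 10 && echo "deleted 10 days ago: restorable"
restorable deleted 45 || echo "deleted 45 days ago: gone from backup"
```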
A Data Steward is someone who owns a set of data stored on the RDSF
in the form of one or more projects. As Data Steward you decide how
much storage a project needs, how long the data should be kept and
who has access to it, and you pay any costs beyond the first 5TB.
Normally the Data Steward would be the PI for any associated
University research projects.
Yes, you can delegate some activities, such as adding or removing users' access and preparing data for publication, to an RA in your research group. We call this a Deputy Data Steward and we recommend a maximum of two per project.
A project is a set of data associated with an activity. It is
stored as a directory on the RDSF and made available as a Windows
(CIFS) share, or Unix/Linux (NFS) mountable directory. A Data
Steward may have many projects. Every project must have a Data
Steward.
Assume your project is called My_Project
On a PC - From Windows Explorer, open or map a network drive to the path \\rdsfcifs.acrc.bris.ac.uk\My_Project.
On a Mac - In the Finder select 'Go>Connect To Server...' from the menu, or press CMD-K, then enter smb://rdsfcifs.acrc.bris.ac.uk/My_Project into the dialogue box.
From the Linux desktop - This can vary, but most now have a 'Connect to Server' from the Places menu.
Using this select "Windows Share", Server: rdsfcifs.acrc.bris.ac.uk, Folder: My_Project
If any of the above ask for a Windows Domain the answer is UOB (all in capitals).
Linux NFS - we recommend Linux users connect to the standard Windows share over SMB.
Go to 'Connect to Server' from the Places menu. Using this select "Windows Share", Server: rdsfcifs.acrc.bris.ac.uk, Folder: My_Project
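On a Linux machine without a desktop, the same Windows share can usually be mounted from the command line. The sketch below assumes the `cifs-utils` package is installed and that the mount point already exists; the function name `mount_rdsf`, the username `ab12345` and the mount point `/mnt/rdsf_project` are placeholders. With `RUN=echo` the command is only printed; set `RUN=sudo` to run it, at which point you will be prompted for your password.

```shell
#!/bin/sh
# RUN=echo prints the command; RUN=sudo would execute it as root.
RUN=echo

# mount_rdsf PROJECT USERNAME MOUNTPOINT
# Mounts the project share over SMB using the UOB Windows domain.
mount_rdsf() {
    $RUN mount -t cifs \
        "//rdsfcifs.acrc.bris.ac.uk/$1" "$3" \
        -o "username=$2,domain=UOB"
}

mount_rdsf My_Project ab12345 /mnt/rdsf_project
```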
Any member of the University the Data Steward authorises. Either give us a list when you ask for the Project to be created or let us know later on and we will add or remove access for them.
All members of the project have full access. However, a Data Steward could apply for two projects, for example one for the whole research group and one for a small number of users, maybe just the Data Steward and one or two RAs.
So we can support both Windows and Linux users. The Windows & Unix/Linux views of who can do what may look similar on the outside, but are quite different. This is not just an issue for the RDSF and is currently being looked at by the IT Services Unix Virtual Team.
As Data Steward, you can add any PhD students you supervise as users of your project.
For Windows/Mac (CIFS sharing) any of the project users can access
the data as a Windows shared drive from any machine on-site or via
the University VPN. If you'd prefer tighter restrictions, for
example restricting access to a small number of PCs, just ask.
For most Unix/Linux systems it is possible to use Windows sharing as
above. However for NFS you will need to let us have a list of
authorized machines. Be aware that normal Unix/Linux permissions
apply so your local root account will have unrestricted access to
the project. You will also have to use University Standard UIDs and
add the project's Unix group to your system. If you're unsure about
this consult with your Zonal support team.
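Before asking for NFS access it can help to check that your local account lines up with what the RDSF expects. A rough sketch follows, assuming a Linux client with a conventional `/etc/group`; the function names, the UID `12345` and the group name `my_project_grp` are made-up placeholders, so substitute the values supplied to you.

```shell
#!/bin/sh
# uid_matches EXPECTED_UID
# Succeeds if the current user's numeric UID equals EXPECTED_UID.
uid_matches() {
    [ "$(id -u)" = "$1" ]
}

# group_exists NAME
# Succeeds if NAME is defined in the local /etc/group file.
group_exists() {
    grep -q "^$1:" /etc/group
}

uid_matches 12345 || echo "local UID $(id -u) differs from 12345"
group_exists my_project_grp || echo "group my_project_grp is not defined locally"
```

This only inspects the local group file. On machines that take their groups from a directory service, `getent group my_project_grp` is the more complete check.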
It's a pre-created folder for you to use when you wish to publish research data via the University Research Data Repository. This is managed by the Research Data Service team and guidance is provided on their website.
Absolutely not. Only data that is to be published via the University Research Data Repository should be stored there. For other data you should create other folders alongside it as necessary.