Cambridge Cybercrime Centre: Process for working with our data

This page sets out the steps in the process for obtaining data from the Cybercrime Centre.

Assess whether you will be allowed to use our data

Our datasets are intended for research and analysis into methods to find, understand, investigate and counter cybercrime, so your project must clearly fall into this space. Although we do not require researchers to be academics, there are significant restrictions on using our data for commercial purposes.

Some of our data was generated internally, so we can make it available for other types of project and for commercial purposes. However, much of our data has come from third parties, who provided it to us only because of the framework under which it will be shared.

Identify the data you wish to use

We describe our various datasets on this page. The descriptions are public and necessarily fairly high level. We do, however, try to indicate the size of the datasets and the period over which they were collected, along with possible causes of bias.

We strongly encourage the use of prepacked datasets rather than "live feeds". Although a live feed may be superficially attractive, it makes it harder to arrange that other researchers can receive the same data that you did -- a key aim of the Cybercrime Centre is to enable reproducible research. If the issue is that you need to collect a further "field" over and above what we supply, then talk with us and we may well be able to do this for you.

Read about our legal framework

It is important that you understand the basis on which we share data and the paperwork that will need to be signed.

There are several pages of explanations and FAQs about our agreements, starting at https://www.cambridgecybercrime.uk/data.html, which you should read before contacting us.

Make an application

You will need to make a formal application to use our data. In the first instance you should send an email to < datarequest AT cambridgecybercrime.uk >.

This email should set out:

The name of your institution: Note that your institution needs to sign the Agreement with the Cybercrime Centre, so please provide the exact text which will go into the agreement: [RESEARCH INSTITUTION] incorporated and registered in [COUNTRY] with company number [X] whose registered office is at [ADDRESS]
The name of the lead researcher: Note that the lead researcher and the institution (but not the nature of the project) will be published on our webpages. It is usually the case that institutions are only prepared to sign agreements where the lead researcher is employed by them -- so if you are a research student or post-doc you probably want to make your supervisor / line manager the lead researcher.
The names of other researchers who you will allow to access the data: Note that the legal agreement requires that everyone who accesses the data be bound by the agreement, and you have to let us know who they are.
Brief details of your project: A few lines of text are sufficient, but we do need to be able to determine that what you propose fits within our framework.
Which dataset(s) you wish to be able to use: Be clear!
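
If you prefer to draft the application programmatically, the checklist above can be sketched in Python as follows. This is only an illustrative sketch, not something the Cybercrime Centre provides or requires: the field names, the subject line and the `build_application` helper are all assumptions made for the example, and the recipient address is simply the one given above written in the usual form.

```python
from email.message import EmailMessage

# Hypothetical field names; the five items mirror the checklist above.
REQUIRED_FIELDS = (
    "institution",
    "lead_researcher",
    "other_researchers",  # write "None" explicitly if there are no others
    "project_details",
    "datasets",
)


def build_application(fields: dict, sender: str) -> EmailMessage:
    """Assemble a plain-text application email, refusing to build one with gaps."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "datarequest@cambridgecybercrime.uk"
    msg["Subject"] = "Application to use Cybercrime Centre data"  # assumed subject
    # One labelled paragraph per checklist item, in the order listed above.
    body = "\n\n".join(
        f"{name.replace('_', ' ').capitalize()}:\n{fields[name]}"
        for name in REQUIRED_FIELDS
    )
    msg.set_content(body)
    return msg
```

Calling `build_application` with any of the five items left out raises a `ValueError` naming the gaps, which is a cheap way to catch an incomplete application before it is sent.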

Get the appropriate paperwork signed

We will assess your application and ask further questions as needed and then we will create a custom version of our "outgoing" agreement. You will need to check this and then arrange for your institution to sign it. "Cambridge" will then countersign and date the agreement.

Download the data that you will be using

Once all the paperwork is in place we will supply you with appropriate credentials for you to download the data.

Keep us in the loop!

You are obliged by the agreement to tell us if you add further researchers to your project.

If the scope of your project changes then you will have to come back to us because that may require the signing of a new agreement.

When you get to the stage of publishing your work then you will need to send us a copy of your paper at the point at which it is submitted for peer review. That allows us to check at the earliest sensible stage that you have not inadvertently breached any of the terms on which you have received our data. If you publish your results on a blog (or similarly) then you should send us a relevant URL immediately the article goes live.

The agreement will run for one year in the first instance. You are responsible for renewing it as necessary (which is likely to be a formality). You should read our page on extending agreements for more about this.

You are required to keep the data we give you secure (and it's a formal requirement that you keep it encrypted at rest). If things go wrong then let us know as soon as possible -- we have obligations to our data suppliers to keep them in the loop as well.

If problems occur

If roadblocks arise at any stage of the process then we will do everything we can to assist. Our aim is to cause more cybercrime research to take place and so we want to iron out the difficulties.

We do of course reserve the right to refuse to make data available for particular projects because we don't believe that they fall within the purposes for which we can share data or because we have concerns about some other aspect of the project. The PIs for our project can be asked to review any decision that is felt to be unreasonable.

Useful links

Formal "outgoing" agreement: https://www.cambridgecybercrime.uk/outboundAgreement-201812.pdf

Guide for data recipients: https://www.cambridgecybercrime.uk/outGuide.html

Legal overview: https://www.cambridgecybercrime.uk/data.html

Department of Computer Science and Technology