Databox at BT Innovation 2017

Last week, our team showcased our Databox platform at BT Innovation week at Adastral Park, Ipswich, UK. There were nearly 5000 visitors over 5 days at the show.

Over the week, our team talked to a mix of organisations: a couple of banks, healthcare providers, a housing association, IoT developers, the BBC, Sky, EPSRC and BT researchers. We presented three use-cases: fraud detection, personalised adverts and health insurance. Many attendees were able to see use-cases for their own sectors. Typical questions were “How much will it cost?”, “When will it be ready/commercialised?”, “How is a centralised local datastore model more secure than a distributed one?”, “What would be the physical form factor of the product if deployed?”, “Does it require dedicated hardware?”, “Can it run in BT’s Home Hub?” and “How would data usage be analysed?”.

In addition, many industry attendees mentioned concerns around the EU General Data Protection Regulation (GDPR) and could see how Databox can help industries and businesses address issues related to personal data storage. Most of the discussions were about the overall concept, along the lines of “how would I do this/that”, and about potential new applications. Overall, the project received positive feedback and follow-up invitations from the audience.



Databox 0.1.2 with Docker Swarm Mode and better ARM support

Databox 0.1.2 has been released. Lots of bugs have been squashed, ARMv7 and 64-bit ARM support for the Raspberry Pi and other ARM devices has been improved, and the developer experience has been enhanced by using local Docker images. This is the best Databox release yet!

Get the software at:

Release notes:

  1. Moving to Docker swarm mode
  2. No need to install nodejs locally anymore
  3. Don’t pass sensitive data in ENV vars
  4. Better ARM support


  • Removed need for local registry for development
  • Moved the container manager into its own repo me-box/databox-cm
  • All platform images are now built locally
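The release notes do not say how sensitive values are passed once they are no longer in ENV vars. One option that Docker swarm mode offers is secrets, which are mounted inside a container as files under /run/secrets/<name>. The sketch below assumes that mechanism; the helper `read_secret` and the secret name `ARBITER_TOKEN` are hypothetical and are not taken from the Databox code.

```python
import tempfile
from pathlib import Path

# Default mount point that Docker swarm mode uses for secrets.
DEFAULT_SECRETS_DIR = Path("/run/secrets")


def read_secret(name: str, secrets_dir: Path = DEFAULT_SECRETS_DIR) -> str:
    """Return the contents of a secret file without its trailing newline."""
    path = secrets_dir / name
    if not path.is_file():
        raise FileNotFoundError(f"secret {name!r} not found in {secrets_dir}")
    return path.read_text(encoding="utf-8").rstrip("\n")


if __name__ == "__main__":
    # Simulate the mount point with a temporary directory so the sketch
    # also runs outside a container.
    with tempfile.TemporaryDirectory() as tmp:
        secrets_dir = Path(tmp)
        # Hypothetical secret name, used for illustration only.
        (secrets_dir / "ARBITER_TOKEN").write_text("s3cr3t\n", encoding="utf-8")
        print(read_secret("ARBITER_TOKEN", secrets_dir))
```

Reading from a file keeps the value out of `docker inspect` output and out of the environment inherited by child processes, which is the usual argument against ENV vars for secrets.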

Databox posters at EuroSys 2017

There will be 3 PhD student posters on display during the Security and Privacy session at the 11th EuroSys Doctoral Workshop, next week in Belgrade:

If you’re at EuroSys next week, please drop by and talk to one of the students about their exciting research directions and challenges!

Databox Open-Source Software Community Launch Report

The Databox Open-Source Software Community Launch presentation slides are available here:

Overview and Introductions

Databox Architecture

SDK and Interface

Databox Core Components

Driver Developer Guide

Here’s a blog re-post by Gemma Gordon about our recent event in Cambridge last Friday:

Original post:


The team working on the Databox Project hosted their Cambridge open-source community launch on Friday 24th March at Darwin College, Cambridge.

The event served to introduce the motives behind Databox and the structure of the project, and to gauge potential use cases among the community and prospective application developers. The team presented the initial release of a working open source Databox platform, which includes basic data collection support through mobile sensing libraries and selected APIs, provides basic data flow policing and privacy policy enforcement, and supports installation and operation of simple personal data processing apps.

Photos courtesy of Hamed Haddadi.

The morning session began with a formal introduction to the research project by Hamed Haddadi, who explained its high-level goal: “Can we do detailed, user-centric, contextual analytics at a scalable rate without privacy disasters and legal challenges?” Richard Mortier followed with a summary of the technical architecture of the Databox and described the driving motive as an open-source, personal networked system, NOT another data silo that acts as a honey pot – the focus being to move computation to where the data is, thus reducing the movement of data itself. Tosh Brown and Yousef Amar then followed with (working!) demonstrations of the Databox SDK and UI, and of the development of drivers and applications at the container level.

The afternoon session was driven by the attendees, who were all asked to propose applications for and uses of the Databox, with small focus groups facilitating this development.

See my raw notes from the event below.

Thank you to all those who attended, the Databox Project team, and to the staff at Darwin College.

Contribute to the open-source software Databox project

You can contribute to the open-source Databox prototype by visiting the repository and checking out the:

Join the community discussion in the Databox Discourse forum.


The Databox seeks to collate, curate and mediate third-party access to your personal data, whilst creating a user-friendly environment in which to manage that data effectively. We are generating more data than ever through wearables, social media etc., and our digital footprint can be used by third parties to infer a wealth of information about us. Currently the user has little choice about which data is shared and with whom – we need a privacy-aware data analytics platform.

Technical Architecture and Design Principles

Performing local data processing and moving data as little as possible has benefits including:

  • context retention
  • reduction of honey pot effects
  • efficiency, and latency reduction
  • more varied sources of accessible data: Twitter, home IoT devices, smartphone sensing etc

Design principles:

  • clear separation of components
    • intercommunication via specified applications
    • use of containers, e.g. Docker
  • distinct data sources represented by distinct data stores – if one is leaked, only that data is exposed, not all data
  • components are disconnected by default – reduces the attack surface – containers cannot talk to arbitrary cloud services – they will have to go through an export service
  • data flow logged for audit – log store for audit with tools to process logging information
    • how is data being used and moved/exported
    • data processing is transparent to users to allow better control and understanding
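The design principles above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the `Box` class, its method names and the example component and source names are all hypothetical, and real Databox components are separate containers rather than Python objects. It shows one store per data source, components that are disconnected unless explicitly granted access, and an audit log that records every attempt.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Toy model: one store per source, deny-by-default access, audit log."""

    stores: dict = field(default_factory=dict)     # source name -> list of records
    allowed: set = field(default_factory=set)      # (component, source) pairs
    audit_log: list = field(default_factory=list)  # (component, action, source, outcome)

    def add_source(self, source: str) -> None:
        # Each data source gets its own store, so a leak exposes only that source.
        self.stores[source] = []

    def grant(self, component: str, source: str) -> None:
        # Components are disconnected by default; access must be granted explicitly.
        self.allowed.add((component, source))
        self.audit_log.append((component, "grant", source, "ok"))

    def read(self, component: str, source: str) -> list:
        # Every read attempt is logged, whether or not it is permitted.
        if (component, source) not in self.allowed:
            self.audit_log.append((component, "read", source, "denied"))
            raise PermissionError(f"{component} may not read {source}")
        self.audit_log.append((component, "read", source, "ok"))
        return list(self.stores[source])


if __name__ == "__main__":
    box = Box()
    box.add_source("twitter")
    box.add_source("hue")
    box.stores["twitter"].append({"text": "hello"})
    box.grant("sentiment-app", "twitter")

    print(box.read("sentiment-app", "twitter"))
    try:
        box.read("sentiment-app", "hue")
    except PermissionError as err:
        print("blocked:", err)
    for entry in box.audit_log:
        print(entry)
```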

Platform components that form the core:

  • container manager: managing apps, starting/stopping containers – UI/dashboard
  • log store (separate container currently) to log all actions
  • arbiter: mints tokens and manages permissions (a separate container at the moment); the root-level catalogue uses Hypercat with nested catalogues
  • export service: data is taken off the box and sent elsewhere – specific set of requirements meaning that no data can leave the box without being permitted to do so by the user
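As a rough illustration of how an arbiter and an export service could cooperate, the sketch below mints HMAC-signed tokens scoped to one app and one destination, and refuses any export that lacks a valid token. The token format, the class and method names and the example URL are assumptions made for illustration; the notes do not describe the token scheme Databox actually uses.

```python
import hashlib
import hmac
import json
import secrets


class Arbiter:
    """Mints and verifies tokens scoped to a single (app, target) pair."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # signing key known only to the arbiter

    def _sign(self, payload: str) -> str:
        return hmac.new(self._key, payload.encode(), hashlib.sha256).hexdigest()

    def mint(self, app: str, target: str) -> str:
        # Called once the user has approved this app for this target.
        payload = json.dumps({"app": app, "target": target}, sort_keys=True)
        return f"{payload}|{self._sign(payload)}"

    def verify(self, token: str, app: str, target: str) -> bool:
        payload, _, signature = token.rpartition("|")
        if not hmac.compare_digest(signature, self._sign(payload)):
            return False
        return json.loads(payload) == {"app": app, "target": target}


class ExportService:
    """The only route off the box: exports require a valid arbiter token."""

    def __init__(self, arbiter: Arbiter) -> None:
        self._arbiter = arbiter
        self.sent = []  # stands in for real network delivery

    def export(self, app: str, destination: str, data: dict, token: str) -> None:
        if not self._arbiter.verify(token, app, destination):
            raise PermissionError(f"{app} is not permitted to export to {destination}")
        self.sent.append((destination, data))


if __name__ == "__main__":
    arbiter = Arbiter()
    exporter = ExportService(arbiter)
    destination = "https://example.com/ingest"  # hypothetical destination
    token = arbiter.mint("sentiment-app", destination)
    exporter.export("sentiment-app", destination, {"score": 0.7}, token)
    print(exporter.sent)
```

Binding the destination into the token means a token granted for one endpoint cannot be reused to send data somewhere else.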

Dynamic components that you may install to interact with services and data:

  • drivers – interact with services e.g. Hue, Twitter – drivers are containers. Interaction is via a REST API, with a data store attached for those logs
  • apps – process the data; this is where the computation happens. Apps are installed as containers, with explicit permissions requested upon installation and provided by the arbiter to allow them to access specific data.

UI and SDK

The SDK provides a user-friendly cloud environment for building Databox applications quickly, and for finding approved applications to use on your own Databox – you simply require a GitHub login to access it. The graphical programming environment allows you to drag in and connect nodes, view the function output, and debug if needed. There are other useful details such as built-in visualisations that allow you to view your data as graphs, lists etc., and application manifests which include any resources your app needs and different levels of functionality to correspond with existing devices. Current applications include Hue lights, a mobile sensing driver and Twitter.

Apps and Drivers

In the Databox, an application can interact with three areas:

  • stores (both data and driver)
  • arbiter
  • export service

An application includes:

  • app manifests: description, resources required, metadata, textual representation of permissions that the app might request, standard Dockerfile (+ Databox label, and UI port exposure details) to build the app
  • environment variables: URLs for containers to connect to, data source metadata in Hypercat format, URL for the data source store, CA root certificate for the container for use over HTTPS (and a private key if you want to host on an HTTPS server)
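A minimal sketch of how an app might collect the environment variables listed above into one configuration object. The variable names (`ARBITER_URL`, `STORE_URL`, `DATASOURCE_METADATA`, `CA_ROOT_CERT`) are hypothetical, since the notes do not give the real ones, and the metadata is assumed to be a single Hypercat-style item with `href` and `item-metadata` fields.

```python
import json
from dataclasses import dataclass
from typing import Mapping

# Hypercat relation assumed to carry the human-readable description.
DESCRIPTION_REL = "urn:X-hypercat:rels:hasDescription:en"


@dataclass(frozen=True)
class AppConfig:
    arbiter_url: str
    store_url: str
    datasource_href: str
    datasource_description: str
    ca_root_cert: str


def load_config(env: Mapping[str, str]) -> AppConfig:
    """Build the app configuration from a mapping such as os.environ."""
    item = json.loads(env["DATASOURCE_METADATA"])
    description = next(
        (m["val"] for m in item.get("item-metadata", []) if m.get("rel") == DESCRIPTION_REL),
        "",
    )
    return AppConfig(
        arbiter_url=env["ARBITER_URL"],
        store_url=env["STORE_URL"],
        datasource_href=item["href"],
        datasource_description=description,
        ca_root_cert=env["CA_ROOT_CERT"],
    )


if __name__ == "__main__":
    # Sample values for illustration; in a container, pass os.environ instead.
    sample_env = {
        "ARBITER_URL": "https://arbiter:8080",
        "STORE_URL": "https://twitter-store:8080",
        "DATASOURCE_METADATA": json.dumps({
            "href": "https://twitter-store:8080/timeline",
            "item-metadata": [{"rel": DESCRIPTION_REL, "val": "Twitter timeline"}],
        }),
        "CA_ROOT_CERT": "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----",
    }
    config = load_config(sample_env)
    print(config.datasource_href, "-", config.datasource_description)
```

Failing fast with a `KeyError` on a missing variable is deliberate here: an app that starts without its arbiter or store URL cannot do anything useful.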



Databox Open Source Software Community Launch

Dear Open Source Software enthusiasts

We are inviting you to join us in the open-source community launch of the Databox project on Friday 24th of March at Darwin College, in co-operation with the OCaml Labs in Cambridge.

Databox, started in October 2016 with generous funding by EPSRC, envisions an open-source personal networked device, augmented by cloud-hosted services, that collates, curates, and mediates access to an individual’s personal data by verified and audited, locally-executable, third party applications and services. The Databox will form the heart of an individual’s personal data processing ecosystem, providing a platform for managing secure access to data and enabling authorised third parties to provide the owner with authenticated services, including services that may be accessed while roaming outside the home environment. You can find out more about the project on and view the in-progress code at

We will present the initial release of a working open source Databox platform, which can be run on any device capable of running Docker containers. We endeavour to provide support for ARM devices such as the Raspberry Pi 3 for this release. This initial release will have basic data collection support through mobile sensing libraries and selected APIs, will provide basic data flow policing and privacy policy enforcement, and will support installation and operation of simple personal data processing apps. At this event we want to introduce the Databox to you, and then we hope to engage with security & privacy enthusiasts, data visualisation & analytics fans, and potential app developers to begin building a community and ecosystem around the Databox. We’re open to contributions of all kinds, from improvements to core components, to helping you integrate your favourite IoT devices, to brainstorming what apps and devices you want to see the Databox support!

Venue: Darwin College Cambridge, Silver Street, Cambridge CB3 9EU, United Kingdom

Local Directions and Travel Instructions


Date: Friday 24th March 2017


9:30-10:00 Registration, Coffee and pastries

10:00 Welcome and Introduction to the Databox project

10:20 Introduction to the open source platform and components (Databox team)

11:00 SDK and Building an example app

11:30 Feature requirements discussions and BoF work groups

12:00 Development session I

13:00 Buffet Lunch

13:30 Development session I continued

14:30 Tea Break

14:45 BoF sessions II

15:00 Feature & App development session II

16:30 Group/individual presentations

17:00 Close

18:30 Drinks and dinner

Two Databox jobs available in Cambridge

We are advertising two researcher jobs at the University of Cambridge Computer Laboratory for up to 2 years with potential for extension.

Research Assistant/Associate in the Systems Research Group at the University of Cambridge Computer Laboratory (closes 14 February 2017)

Research Associate/Senior Research Associate in the Systems Research Group at the University of Cambridge Computer Laboratory (closes 14 February 2017)

The deadline for applications is 14th February. Please forward the details to recent graduates and to those looking for postdoc and senior research positions.


Mozfest 2016 and the Databox Launch

Last week was an eventful start for the Databox project, with a strong presence at Mozilla Festival 2016 followed by the official launch event at the IET, Savoy Place!

At MozFest we presented a Smart Kitchen demo with partners the BBC R&D Labs as part of the “A Tale of Two Cities: Dilemmas in Connected Spaces” session. In this we set up a number of smart utensils to work with a synced video that guided participants in a bake-off competition. While the winning contestant found the interaction with the devices useful, some found the smart utensils rather confusing, so there is clearly plenty of scope for work to integrate IoT into future Smart Homes.

We also ran several hack-an-app sessions where participants were able to use our modified Node Red environment to build apps that processed IoT data streams. The level of engagement — some participants stayed for several hours on Sunday! — suggested that there’s plenty of scope for enabling users to build and publish their own apps. Perhaps this is what’s needed to release the IoT’s potential?

We also had our official launch event at the IET Savoy Place, a great venue 🙂 We hosted several community members, SMEs, industry partners, and NGO representatives (thanks to all for coming!), presenting a series of talks followed by demos of the prototype and the software SDK, and finally a panel session between the industry partners and advisory boards discussing the challenges and opportunities faced by the project. There was some great feedback, and a chance for everyone to see what they’d missed over the weekend.

Finally, the Databox project was discussed as part of the “Turing Lecture: Data Science, National Security and Systems Challenges” at the Alan Turing Institute. You can find the video of the talks here, with specific discussions about Databox at 47:10.

Here are a few pictures from the Mozilla Festival and the launch event. A great job by the team, and thanks to everyone who participated during MozFest and who came along to the launch event. Exciting times ahead!



Databox Hack-an-app booth during Mozilla Festival 2016

Come and check out our platform, prototype, and exemplar apps at Mozilla Festival 2016! Databox and the BBC will host the Hack-an-app session and the Kitchen Databox Demo at the “A Tale of Two Cities: Dilemmas in Connected Spaces” session.



Friday, 6:00pm-7:30pm, Dilemmas In Connected Spaces, Floor 6 – 603

Saturday, 2:00pm-3:00pm, Dilemmas In Connected Spaces, Floor 6 – 603
The Future Of The Web
Saturday, 3:15pm-4:15pm, Dilemmas In Connected Spaces, Floor 6 – 603
Saturday, 4:30pm-8:00pm, Dilemmas In Connected Spaces, Floor 6 – 601
Sunday, 2:00pm-3:00pm, Dilemmas In Connected Spaces, Floor 6 – 603
Sunday, 2:00pm-5:30pm, Dilemmas In Connected Spaces, Floor 6 – 601

Full details of the kitchen demo are available on: