Market engagement in the Databox project

A new report, commissioned by the Horizon Digital Research Institute, examines the industry and consumer market engagement opportunities of the Databox project. It considers market engagement with the Databox concept as it emerges from a limited number of conversations with project stakeholders, business consultants and industry experts, and explores the value proposition for individuals, ‘data-rich’ companies and ‘data-poor’ companies.

See the full report on



Creating the living room of the future…

In 2018 we will showcase a live engagement event and demonstrator of the ‘Future of the Living Room’ with BBC R&D at FACT in Liverpool, as part of the States of Play exhibition, and at the Western Balkans Culture Summit.

The living room will be open to the public at States of Play in May 2018 at FACT, Liverpool, and at the Western Balkans Culture Summit in August 2018.

See the details on:

Databox Version: 0.2.0 released!

Databox Version 0.2 has been released in time for our MozFest 2017 Hackathon next week. See the release notes on:


Changes since last version:

Changes to me-box/databox:

Changes to core repositories:

Changes to me-box/core-container-manager:

Changes to me-box/core-arbiter:

Changes to me-box/core-export-service:

Changes to me-box/store-json:

Changes to me-box/store-timeseries:

  • No changes in this version

Changes to me-box/lib-node-databox:

Changes to me-box/lib-go-databox:

  • No changes in this version


Databox Annual Symposium: Fri, November 17, 2017

Please register via:

The Databox project will have its first birthday this November. A lot has happened since last year, especially on the platform and analytics side. Please join us for the Year 1 roundup of the research, prototype, and demos of the Databox project. The event will be at the IET London and will include fun and interactive demos with personal data and IoT devices, in addition to research highlights, panel discussions, and debates around the next steps for the project over the next 2 years.

Confirmed Invited Speakers:

Joel Obstfeld (Distinguished Engineer, Cisco)

Eleanor Birrell (PhD candidate, Cornell University)

Andrius Aucinas (Head of Engineering at the Hub of All Things project)

Laura James (Technology Principal at Doteveryone)

Guy Cohen (Strategic Relationships Manager, Privitar)


More information is on the Eventbrite page.

In the meantime, please keep in touch with us via the forum.

Databox HackDay at MozFest 2017 (Thu. 26 Oct)

As part of Mozilla Festival 2017, we invite you to join us at the Databox Hackathon, a joint summit hosted under the Mozilla Festival pre-week events and a BBC R&D community event.

Please register here:

We will present the public release of a working open source Databox platform, which can be run on any device capable of running Docker containers. We endeavour to provide support for ARM devices such as the Raspberry Pi 3 for this release. This initial release has basic data collection support through mobile sensing libraries and selected APIs, provides basic data flow policing and privacy policy enforcement, and supports installation and operation of simple personal data processing apps. At this event we will briefly introduce and demo the Databox to you, then we hope to engage with security & privacy enthusiasts, data visualisation & analytics fans, and potential app developers to begin building a community and ecosystem around the Databox. We’re open to contributions of all kinds, from improvements to core components, to helping you integrate your favourite IoT devices, to brainstorming what apps and devices you want to see the Databox support!

The full schedule is available on the Eventbrite page.

Databox at BT Innovation 2017

Last week, our team showcased our Databox platform at BT Innovation week at Adastral Park, Ipswich, UK. There were nearly 5000 visitors over 5 days at the show.

Over the week, our team talked to a mix of businesses: a couple of banks, healthcare providers, a housing association, IoT developers, BBC, Sky, EPSRC and BT researchers. We presented three use-cases: fraud detection, personalised adverts and health insurance. Many attendees were able to see use-cases for their sectors. Typical questions were “How much will it cost?”, “When will it be ready/commercialised?”, “How is a centralised local datastore model more secure than a distributed one?”, “What would be the physical form factor of the product if deployed?”, “Does it require dedicated hardware?”, “Can it run in BT’s home hub?” and “How would data usage be analysed?”.

In addition, many industry attendees mentioned concerns around the GDPR (the EU General Data Protection Regulation) and could see how Databox can help industries and businesses to address issues related to personal data storage. Most of the discussions were about the overall concept, centred on “how would I do this/that” and on new potential applications. Overall, the project got positive feedback and follow-up invitations from the audience.



Databox 0.1.2 with Docker Swarm Mode and better ARM support

Databox 0.1.2 has been released. Lots of bugs have been squashed, ARMv7 and ARM64 support for the Raspberry Pi and other ARM devices has been improved, and the developer experience has been enhanced by using local Docker images. This is the best Databox release yet!

Get the software at:

Release notes:

  1. Moving to Docker swarm mode
  2. No need to install Node.js locally anymore
  3. Don’t pass sensitive data in ENV vars
  4. Better ARM support


  • Removed the need for a local registry for development
  • Moved the container manager into its own repo, me-box/databox-cm
  • All platform images are now built locally
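
The third release note (not passing sensitive data in ENV vars) fits naturally with swarm mode, where Docker mounts each secret as a file under /run/secrets/ inside the container. The following is a minimal sketch of reading such a file, not code taken from Databox; the function name and any secret name you pass to it are hypothetical.

```python
from pathlib import Path

# Docker swarm mode mounts each secret as a file under this directory by default.
DEFAULT_SECRETS_DIR = Path("/run/secrets")


def read_secret(name: str, secrets_dir: Path = DEFAULT_SECRETS_DIR) -> str:
    """Return a secret's contents, read from a mounted file rather than os.environ."""
    secret_file = secrets_dir / name
    if not secret_file.is_file():
        raise FileNotFoundError(f"secret {name!r} is not mounted in {secrets_dir}")
    # Trailing newlines are common when secrets are created from text files.
    return secret_file.read_text(encoding="utf-8").strip()
```

Unlike an environment variable, a mounted secret does not show up in the container's inspected configuration or get inherited by every child process.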

Databox posters at EuroSys 2017

There will be 3 PhD student posters on display during the Security and Privacy session at the 11th EuroSys Doctoral Workshop, next week in Belgrade:

If you’re at EuroSys next week, please drop by and talk to one of the students about their exciting research directions and challenges!

Databox Open-Source Software Community Launch Report

The Databox Open-Source Software Community Launch presentation slides are available here:

Overview and Introductions

Databox Architecture

SDK and Interface

Databox Core Components

Driver Developer Guide

Here’s a blog re-post by Gemma Gordon about our event in Cambridge last Friday:

Original post:


The team working on the Databox Project hosted their Cambridge open-source community launch on Friday 24th March at Darwin College, Cambridge.

The event served to introduce the motives behind Databox and the structure of the project, and to gauge use cases among the community and potential application developers. The team presented the initial release of a working open source Databox platform, which includes basic data collection support through mobile sensing libraries and selected APIs, provides basic data flow policing and privacy policy enforcement, and supports installation and operation of simple personal data processing apps.

Photos courtesy of Hamed Haddadi.


The morning session began with a formal introduction to the research project by Hamed Haddadi, who explained its high-level goal: “Can we do detailed, user-centric, contextual analytics at a scalable rate without privacy disasters and legal challenges?” Richard Mortier followed with a summary of the technical architecture of the Databox. He described the driving motive as an open-source, personal networked system, NOT another data silo that acts as a honey pot; the focus is on moving computation to where the data is, thus reducing the movement of data itself. Tosh Brown and Yousef Amar then gave (working!) demonstrations of the Databox SDK and UI, and of the development of drivers and applications at the container level.

The afternoon session was driven by the attendees, who were all asked to propose applications for and uses of the Databox, with small focus groups facilitating this development.

See my raw notes from the event below.

Thank you to all those who attended, the Databox Project team, and to the staff at Darwin College.

Contribute to the open-source software Databox project

You can contribute to the open-source Databox prototype by visiting the repository and checking out the:

Join the community discussion in the Databox Discourse forum.


The Databox seeks to collate, curate and mediate third-party access to your personal data, whilst creating a user-friendly environment in which to manage your data effectively. We are generating more data than ever through wearables, social media etc., and our digital footprint can be used by third parties to infer a wealth of information about us. Currently the user has little choice about which data is shared and with whom; we need a privacy-aware data analytics platform.

Technical Architecture and Design Principles

Performing local data processing and moving data as little as possible has benefits including:

  • context retention
  • reduction of honey pot effects
  • efficiency, and latency reduction
  • more varied sources of accessible data: Twitter, home IoT devices, smartphone sensing etc
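
To make the idea of local processing concrete, here is a minimal sketch in which raw readings stay on the box and only a small summary is produced for sharing. The `Reading` type, the heart-rate values and the function name are all hypothetical illustrations, not part of the Databox codebase.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    timestamp: int      # seconds since epoch
    heart_rate: float   # beats per minute (illustrative sensor value)


def summarise_locally(readings: list[Reading]) -> dict:
    """Reduce raw readings to a small summary; only this summary is meant to leave the box."""
    if not readings:
        return {"count": 0}
    rates = [r.heart_rate for r in readings]
    return {
        "count": len(rates),
        "mean": round(mean(rates), 1),
        "min": min(rates),
        "max": max(rates),
    }


raw = [Reading(1, 62.0), Reading(2, 71.0), Reading(3, 80.0)]
summary = summarise_locally(raw)
print(summary)
```

The third party sees four numbers rather than the full time series, which is the sense in which the raw data becomes less of a honey pot.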

Design principles:

  • clear separation of components
    • intercommunication via specified applications
    • use of containers e.g. Docker
  • distinct data sources represented by distinct data stores – if one is leaked, only that data is exposed, not all data
  • components are disconnected by default – reduces the attack surface – containers cannot talk to arbitrary cloud services – they will have to go through an export service
  • data flow logged for audit – log store for audit with tools to process logging information
    • how is data being used and moved/exported
    • data processing is transparent to users to allow better control and understanding
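
The "disconnected by default" and "data flow logged for audit" principles can be sketched together as a default-deny policy that records every attempted flow. This is an illustrative model only, assuming a simple allow-list of (source, destination) pairs; the class name and the `app-energy` component are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FlowPolicy:
    """Default-deny connectivity: a flow is allowed only if explicitly granted."""
    allowed: set[tuple[str, str]] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def grant(self, source: str, destination: str) -> None:
        self.allowed.add((source, destination))

    def request(self, source: str, destination: str) -> bool:
        permitted = (source, destination) in self.allowed
        # Every attempt is logged, whether or not it succeeds.
        self.audit_log.append(
            {"source": source, "destination": destination, "permitted": permitted}
        )
        return permitted


policy = FlowPolicy()
policy.grant("app-energy", "store-timeseries")
policy.grant("app-energy", "export-service")

# A granted flow succeeds; a direct call to an arbitrary cloud service does not.
assert policy.request("app-energy", "store-timeseries")
assert not policy.request("app-energy", "https://example.com")
```

Because denied attempts are logged too, the audit trail shows what an app tried to do as well as what it was allowed to do.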

Platform components that form the core:

  • container manager: managing apps, starting/stopping containers – UI/dashboard
  • log store (separate container currently) to log all actions
  • arbiter: minting tokens, permissions (separate container at the moment), root level catalogue uses Hypercat with nested catalogues
  • export service: data is taken off the box and sent elsewhere – specific set of requirements meaning that no data can leave the box without being permitted to do so by the user
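
The notes do not record what form the arbiter's tokens take, so the following sketch substitutes a plain HMAC-signed JSON payload to show the general pattern: the arbiter mints a token stating what an app may do, and a store verifies it before serving a request. The function names, token layout and shared-secret arrangement are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json


def mint_token(secret: bytes, app: str, store: str, permissions: list[str]) -> str:
    """Arbiter side: sign a statement of what `app` may do on `store`."""
    payload = json.dumps(
        {"app": app, "store": store, "permissions": sorted(permissions)},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(signature).decode("ascii")
    )


def verify_token(secret: bytes, token: str, store: str, permission: str) -> bool:
    """Store side: accept only an untampered token granting `permission` on this store."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        # Wrong number of parts or invalid base64.
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(payload)
    return claims["store"] == store and permission in claims["permissions"]
```

A store holding the same secret can then check `verify_token(secret, token, "store-timeseries", "read")` without contacting the arbiter on every request.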

Dynamic components that you may install to interact with services and data:

  • drivers – interact with services e.g. Hue, Twitter – drivers are containers. Interaction via REST API with a data store attached for those logs
  • apps – process the data; this is where the computation happens. Apps are installed as containers, with explicit permissions granted upon installation and provided by the arbiter to allow them to access specific data.

UI and SDK

The SDK provides a user-friendly cloud environment for building Databox applications quickly, and for finding approved applications to use on your own Databox – you simply require a GitHub login to access it. The graphical programming environment allows you to drag in and connect nodes, view the function output, and debug if needed. There are other useful details such as built-in visualisations that allow you to view your data as graphs, lists etc., and application manifests, which include any resources your app needs and different levels of functionality to correspond with existing devices. Current applications include Hue lights, a mobile sensing driver and Twitter.

Apps and Drivers

In the Databox, an application can interact with 3 areas:

  • stores (both data and driver)
  • arbiter
  • export service

An application includes:

  • app manifests: description, resources required, metadata, textual representation of permissions that the app might request, standard Dockerfile (+ Databox label, and UI port exposure details) to build app
  • environment variables: URLs for containers to connect to, data source metadata in Hypercat format, URL for data source store, CA root certificate for the container for use over HTTPS (and a private key if you want to host on an HTTPS server)
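
As a rough illustration of the two items above, the sketch below shows a made-up manifest and a helper that gathers an app's connection details from its environment. Every field name, environment variable name, URL and port here is hypothetical; the data source metadata is assumed to be a JSON-encoded item in the Hypercat style (an `href` plus a list of `rel`/`val` pairs), and the real Databox schema may differ.

```python
import json

# Hypothetical manifest: field names are illustrative, not the Databox schema.
EXAMPLE_MANIFEST = {
    "name": "example-app",
    "description": "Plots household light usage",
    "resources": {"memory_mb": 64},
    "permissions": ["Read light bulb state from the lighting driver's store"],
}


def load_app_config(env: dict[str, str]) -> dict:
    """Collect the connection details an app container expects to find at start-up."""
    # Assumed format: a Hypercat-style item with "href" and rel/val metadata pairs.
    item = json.loads(env["EXAMPLE_DATASOURCE"])
    metadata = {pair["rel"]: pair["val"] for pair in item["item-metadata"]}
    return {
        "arbiter_url": env["EXAMPLE_ARBITER_URL"],
        "export_url": env["EXAMPLE_EXPORT_SERVICE_URL"],
        "store_url": item["href"],
        "datasource_metadata": metadata,
        "ca_cert_pem": env["EXAMPLE_CA_ROOT_CERT"],
    }


# In a container this would be os.environ; a plain dict keeps the sketch testable.
example_env = {
    "EXAMPLE_ARBITER_URL": "https://arbiter:8080",
    "EXAMPLE_EXPORT_SERVICE_URL": "https://export-service:8080",
    "EXAMPLE_CA_ROOT_CERT": "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----",
    "EXAMPLE_DATASOURCE": json.dumps({
        "href": "https://example-store:8080/light-state",
        "item-metadata": [
            {"rel": "urn:X-hypercat:rels:hasDescription:en", "val": "Light bulb state"},
            {"rel": "urn:X-hypercat:rels:isContentType", "val": "application/json"},
        ],
    }),
}
config = load_app_config(example_env)
print(config["store_url"])
```

Failing fast with a `KeyError` on a missing variable is deliberate in this sketch: an app that starts without its arbiter or store URL cannot do anything useful.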