
SCHARP DataSpace: Designing for Open Science

Half of the 78 million people who have been infected with HIV since its identification in the 1980s have died, making it one of the most merciless epidemics in history. Today, more than 1 in 200 people is infected, with two million new infections recorded each year. While medical advances have made HIV easier to manage, researchers agree that an HIV vaccine is the most likely, and perhaps the only, way to stop the AIDS pandemic.

In the face of all this, the Statistical Center for HIV/AIDS Research & Prevention (SCHARP) is on a mission to help HIV and other vaccine researchers around the world collaborate through data. SCHARP partnered with Artefact and LabKey Software to help define the objective, design a solution, and build the DataSpace, a web tool to empower vaccine investigators to explore data across HIV studies, generate new hypotheses, and accelerate the path to discovery.

The promise of open science: An empowered, aware, collaborative community

HIV has many strains, mutates quickly, infects the very cells meant to fight it, and exposes very little of itself to attack. Researchers have conducted hundreds of HIV vaccine studies over the years, each setting out to explore a specific hypothesis about how the virus works or how we might fight it. Hidden within and across these studies are other important insights that were not part of the analysis plan. They remain undiscovered because the data can be incomplete, inaccessible, and difficult to stitch together. Researchers have to wait years before they can access their colleagues’ results in published research papers. More importantly, the actual data behind the papers is often unavailable, relegated to a huge data graveyard where potential clues to a vaccine stay buried.

In light of this, the Global HIV Vaccine Enterprise, a group of top HIV experts and funders, called for “a dramatic shift in the culture and practice of sharing research data.” Their top priority? Creating “databases for sharing trial data globally and an insistence on pursuing diverse hypotheses.”

The DataSpace brings researchers information that is easy to access, filter, explore, interpret, and export for further analysis. By using DataSpace, they can identify gaps in current research, review and learn about past work that can help them secure grants, and test new ideas to see if they are worth further exploration.

“You can go in and generate hypotheses. You can have a quick look at things and say, ‘this looks interesting, let’s follow up and come up with a proper ancillary study proposal.’ It’s giving me the freedom to play with the data… It fills a niche that is totally empty right now.”

HIV researcher


A new first step for any researcher

The power of science is the ability to build on previous discoveries. Yet our research showed that researchers might not be aware of some existing research, or might lack details on how it was performed. To address that need, we created the “Learn About…” section, which serves as an encyclopedia of HIV vaccine studies and immune assays and as a first step before embarking on a new research study.

Make a virtual cohort

Cohorts are groups of subjects with something in common – usually they are in the same study and treatment. But in the DataSpace, users can define a cohort across studies using any subject characteristic or threshold of experimental performance they choose. Save it for later and explore any number of ideas with it.
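
To make the idea concrete, here is a minimal sketch of a cross-study cohort. It is not the DataSpace implementation: the `Subject` fields, the `make_cohort` helper, the study names, and the 0.8 response threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    # All field names here are assumptions made for this sketch.
    subject_id: str
    study: str
    treatment: str
    response: float  # hypothetical assay readout, scaled 0 to 1


def make_cohort(subjects, predicate):
    """Return every subject, from any study, that satisfies the predicate."""
    return [s for s in subjects if predicate(s)]


# Made-up subjects from two made-up studies.
subjects = [
    Subject("s1", "study_a", "vaccine", 0.82),
    Subject("s2", "study_a", "placebo", 0.05),
    Subject("s3", "study_b", "vaccine", 0.40),
    Subject("s4", "study_b", "vaccine", 0.91),
]

# A virtual cohort: vaccinated subjects above a response threshold,
# regardless of which study enrolled them.
high_responders = make_cohort(
    subjects,
    lambda s: s.treatment == "vaccine" and s.response >= 0.8,
)

cohort_ids = [s.subject_id for s in high_responders]
cohort_studies = sorted({s.study for s in high_responders})
```

In this sketch the cohort is just a saved filter rule, so the same definition could be reapplied later as more studies are added.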

Discover new relationships through a multidimensional view of data

The plot has room for three variables, with special views for comparing groups, comparing experiments, and tracking immune response over time. While it was meant to reveal interesting ideas about immune response, it is also useful for quickly understanding the characteristics of the available data.


Research and insights

Given the complexity of the subject matter, we had to arrange a crash course in vaccine science before we could have a real conversation with world-class researchers. Then we surveyed, interviewed, and observed dozens of people in HIV vaccine science to understand what they do, how they work with data, and their needs and preferences for sharing and collaboration. We went into our field research with an illustrated set of broad data collaboration ideas designed to elicit strong positive and negative reactions that reveal the principles of success.

Scenario-based ideation

We made interactive prototypes and tested them with real users and scenarios throughout our process to determine if we had organized information properly, if tasks were clear, and what problems needed to be fixed.


Unlike consumer-focused collaboration tools, the DataSpace requires a data-centric and precise design. But our users will not need the DataSpace every day, and many are not trained in analytics. So, unlike many visualization tools, it needed a design that could provide value on the first use. Our final design makes key tasks easy to complete and feels data-centric, without the glitz or adornment that we found eroded trust.

Development, beta, and launch

To launch with confidence and quantify the impact of the DataSpace, we conducted a beta program. It demonstrated that the DataSpace delivered great improvements across all goals: access, insight, and collaboration.