3. Survey of Needs

3.1. Overview

The goal of the SPADES (System for Physical Activity Data Evaluation and Storage) project is to develop a software service that will make the collection and analysis of high-resolution physical activity data streamlined and accessible for health researchers using open-source algorithms. Dr. Fahd Albinali, with assistance from Jennifer Beaudin, developed a phone-based interview protocol to elicit the perceived needs of potential users of SPADES. The purpose of the interviews was to uncover and prioritize functional and non-functional requirements. We focused on identifying the most important functional aspects such as data cleaning features, merging data from multiple sensors, determining acceptable epochs, monitoring compliance, and popular analytical algorithms. For non-functional requirements, we asked about security features and issues related to privacy. We also collected general information about the interviewees' current methodologies and their perceptions of the field.

A short web-based survey served as a starting point to identify and engage researchers and to help develop follow-up questions for the extended interview. Ten of the survey respondents, representing a range of study populations and research specialties, were interviewed by phone with detailed follow-up questions about their current process for collecting, uploading, cleaning, and analyzing physical activity accelerometry data, with a focus on tasks that are particularly tedious and perceived barriers for using raw data and advanced analysis techniques.

The findings from these interviews are being used to develop a requirements and specifications document for the SPADES project.

3.2. Methodology

3.2.1. Survey

A short survey (Appendix A) was developed using SurveyMonkey; this survey included questions about data cleaning and analysis methods currently used, as well as interests and perceived barriers for extending research. An invitation to complete the survey was sent to accelerometry and mobile health electronic lists, as well as to individual researchers identified through publications and attendance at the July 2010 workshop on physical activity (PA) accelerometry.

The survey was intended both to quickly assess the variety of methodologies in use by PA accelerometry researchers and to identify health researchers in the target audience for the tool-set who could be engaged and interviewed for detailed feedback. A subset of the respondents who indicated that they would be willing to participate in a follow-up interview were contacted by email.

The survey received 39 responses, including researchers in the US (12), Ireland (5), Netherlands (3), Canada (2), Sweden (2), France (1), Australia (1), and the UK (1). Twelve respondents did not provide location or contact information. Respondents included one NIH scientist.

In describing their field of interest, six respondents indicated working with children and one with older adults. Supplied field of interest also fell into categories including epidemiology (8), disease (4: diabetes, breast cancer, cardiometabolic risk), obesity and nutrition (6), intervention (9), measurement (5), technical (3: statistics, mobile sensing, physical activity recognition), environment (2: built environment, air quality), and specific behavior (1: walking).

3.2.2. Interview

Interviews were conducted by phone by Jennifer Beaudin, were an hour in length, and were audio recorded with the participants' permission. Participating health researchers were compensated with $25 Starbucks cards. Appendix B includes the set of questions prepared for the interview.

Twenty-seven respondents indicated willingness to participate in the follow-up phone interview by providing their contact information. Of these, 14 were contacted for a phone interview; the goal was to talk with both experienced and junior investigators representing several sub-fields (epidemiology, nutrition, intervention) and with differing levels of technical expertise. Of those contacted, 10 researchers scheduled and completed interviews.

The interview participants represented multiple countries: the US (5), Canada (2), Ireland (2), and Sweden (1). Seven worked, at least part time, with study populations of children (including toddlers and preschoolers), and two worked with older adult populations (including chronic disease populations). Two had software development experience and two had written macros for Excel or other software applications.

3.3. Findings

3.3.1. Survey

Survey participants referenced multiple clean-up tools including: SAS (7), Excel (6), Meterware/Meterplus (4), ActiLife/ActiCal (4), Matlab (2), R (2), manufacturer software unspecified (2), and other custom tools (7). Participants also mentioned visually examining data (6), using participant diaries (4), and sending data to an expert (2). Two respondents did not do a cleaning step.

When asked about new algorithms and analysis techniques they would like to try, eight respondents indicated that they were not sure or had no interest and two respondents indicated they do whatever is required for a project. The remaining respondents cited pattern recognition or analysis (7), breaks and bouts in sedentary time (2), sleep/wake patterns (2), and participant-feedback techniques (3). Specific analysis techniques mentioned included measurement error correction models, mean cpm versus median cpm, an algorithm to determine the most useful axes and sensor locations, approaches to comparing data-sets, dynamic nonlinear models, algorithms with covariates without the problem of collinearity, and algorithms which address differences in sensor specification.

A sampling of specific suggestions includes:

“I am looking forward to using pattern recognition techniques. However, at the moment these seem to be still in development and no software is available to my knowledge.”

“I'd be interested in machine learning algorithms but, at present, I'm worried about placement issues (e.g., phone orientation) being required to pick up activity. Further, I'd also be worried of a higher likelihood of misclassification because the feedback is too specific, and therefore more prone for people to say, "that's not right."”

“Would like to do more hour by hour analyses and programs that run two or three regression lines for cut-points would like to be able to add questionnaire data to accelerometry via integration of spread sheets.”

“[We] want to be able to locate "sliding windows" of data that can capture, for example, mean values for the top 30 consecutive minutes of a day.”
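The "sliding window" request in the last suggestion can be sketched briefly. The following is a minimal illustration, assuming one activity count per minute for a single day; the function name `best_window_mean` and the default 30-minute window are assumptions made for this example, not part of any existing tool.

```python
from typing import Sequence, Tuple


def best_window_mean(counts: Sequence[float], window: int = 30) -> Tuple[int, float]:
    """Return (start_index, mean) of the consecutive window with the highest mean.

    `counts` is assumed to hold one value per epoch (e.g., counts per minute).
    """
    if window <= 0 or len(counts) < window:
        raise ValueError("need at least `window` epochs")
    running = sum(counts[:window])
    best_sum, best_start = running, 0
    for start in range(1, len(counts) - window + 1):
        # Slide by one epoch: add the value entering, drop the value leaving.
        running += counts[start + window - 1] - counts[start - 1]
        if running > best_sum:
            best_sum, best_start = running, start
    return best_start, best_sum / window
```

For a full day of minute-level data, `best_window_mean(day_counts, 30)` would return the starting minute and mean value of the most active 30 consecutive minutes.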

Survey respondents were asked about barriers they had encountered and their responses were categorized as follows:

Devices (15): lack of standardization across devices (4), cost (3), failure (2), accuracy (1), placement limitations (1), battery life (1), upload time (1), interruptions from phone calls when embedded in phones (1), and the dominance of ActiGraph (1)
Data collection (15): participant wear compliance (6), participant burden (2), accuracy of self-report (2), sensor loss (2), feedback to participants (1), lack of real-time monitoring options (1), and recruiting (1)
Data upload and cleanup (11): cleaning/Non-Wear algorithms (5), storage and dataset size (3), and time requirements (3)
Data analysis (17): limitations of cut-points (6), raw data reduction to usable outcome variables (6), relating data to health outcomes (2), need for specialized expertise (2), and identifying bouts (1)

3.3.2. Interview

Cut-points and Non-Wear Algorithms

As all of the participants were working, to some extent, with special populations (children, older adults, obese individuals), the applicability of published cut-points was a consistent issue.

“I know the ActiHeart has energy prediction equations, but from what we can tell, they were developed on a set of about 10 adults – in our minds, not applicable to children, and particularly not to preschoolers...” P7

“...our population is usually overweight/obese African American women. The algorithms that come with the software from accelerometer manufacturers do not accurately predict energy expenditure for our population. [expert consultant] has been able to conduct studies to help create population-specific algorithms, which is why we rely on him for all our data cleaning and analyses.” SR

Additionally, lack of certainty about how to define wear time was referenced by two participants.

“one of the trickier things that I find... is that everyone has a different way of cleaning and analyzing the data, so that you never know whether it is a true difference between populations... For example, we and a few others have used the, you only remove extended 0s more than 60 minutes in length, but some other groups have used the 30 minutes..., so they are obviously taking out quite a bit more 0 time or sedentary time...We are looking at associations between sedentary behavior and metabolic risk factors and other groups are doing the same thing, but we are not finding the same things, and it could be that it's our data reduction measures...” P1

Data have not typically been made available, so re-analysis using different Non-Wear algorithms cannot be done.
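P1's point about 60-minute versus 30-minute zero-run rules can be made concrete with a small sketch. The function names `wear_mask` and `wear_minutes`, the one-count-per-minute input, and the choice to treat a run of exactly the threshold length as Non-Wear are assumptions for illustration; this is not a published Non-Wear algorithm.

```python
from typing import List, Sequence


def wear_mask(counts: Sequence[int], min_zero_run: int = 60) -> List[bool]:
    """Return one flag per minute: True = wear, False = Non-Wear.

    A run of consecutive zeros is flagged as Non-Wear when it is at least
    `min_zero_run` minutes long (boundary handling is an assumption).
    """
    mask = [True] * len(counts)
    run_start = None
    # A non-zero sentinel closes a zero run that reaches the end of the record.
    for i, value in enumerate(list(counts) + [1]):
        if value == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_zero_run:
                for j in range(run_start, i):
                    mask[j] = False
            run_start = None
    return mask


def wear_minutes(counts: Sequence[int], min_zero_run: int = 60) -> int:
    """Total minutes classified as wear time under the given rule."""
    return sum(wear_mask(counts, min_zero_run))
```

Applied to the same record, a 30-minute rule removes every zero run that a 60-minute rule removes plus the shorter ones, so estimated wear time and sedentary time both shrink. Keeping the threshold as a parameter is what would allow the same data to be re-analyzed under another group's rule.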

Researchers were conducting studies of 4 to 7 days and typically needed to include at least one weekend day. For different publications or grant applications, data may need to be re-filtered based on wear-time (hours in the day, number of days, inclusion of weekend day). Custom software was not always helpful for this filtering task. “I'm not a programmer, but from working with these two different individuals [programmers], I've seen that if everyone was the same, those programs could have been working smoothly within a month's time. The problem is that if there is any sort of little anomaly, it trips up the program, and when you are working with data-sets of +100 there are going to be quite a few little anomalies in different spots, but you have to make all these exceptions in your program, but then it screws up what was working before.” P1

Examples of anomalies might be a person who wakes up and takes a walk at night [P1] or a preschool child whose parents report them being asleep at a certain time, but whose data suggest wakefulness [P7].

“I am doing that [determining wear time] individually right now. I've written some custom software to try to automate that and I haven't been totally happy with that. It works somewhat, but then I find errors when I go over it myself, oh that's not quite right. I could maybe speed it up a little bit by running that and then scanning it myself... if I were doing hundreds of subjects, I would work on automating it. For now, I am happy that I am getting the results I want by really going over it myself. A lot of it is visual inspection with the logs.” P7

One participant who was actively using custom software for cleaning data and basic analysis noted that tailoring their solution has introduced restrictions on use.

“We have really good computer, statistical, and database support here, I feel very fortunate. On the other hand, it's really expensive and it takes a lot of time and every time we change the darn thing, we've got to redo our software program. And I'm not too helpful to my colleagues who want to set different parameters than I use and they want me to run their data through their software and I just can't, because it would mean changing the software program and getting the guy to spend a week or so and I don't have that money to just do it as a service. [what's different] different epochs, or they will work with seniors and have to different cut-points, and like everyone else in the world, now they are interested in counting sedentary breaks, things along that line.”

Researcher P1 envisioned a software application with Non-Wear and filtering options:

“if there was a program available that was affordable and easy to use and that allowed researchers to manipulate it just a little bit, so you could tailor it to your needs, that would be extremely useful, because with both projects that I was working on, that took months to get that program going and making sure that was working correctly and we were getting what we wanted...I am quite sure that there could be a better or more efficient system out there, if that was presented to me and it was affordable, I wouldn't hesitate.”
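The kind of tailorable filtering P1 describes might look like the sketch below, which decides whether a participant's record meets a wear-time inclusion rule. The function name `meets_wear_criteria` and the default thresholds (10 hours per day, 4 valid days) are hypothetical values chosen for the example; the interviews only establish that hours per day, number of days, and inclusion of a weekend day are the criteria that get changed.

```python
from datetime import date
from typing import Dict


def meets_wear_criteria(wear_hours_by_day: Dict[date, float],
                        min_hours: float = 10.0,
                        min_days: int = 4,
                        require_weekend: bool = True) -> bool:
    """Check a participant's daily wear hours against configurable criteria."""
    # A day is valid when it has at least `min_hours` of wear time.
    valid_days = [d for d, hours in wear_hours_by_day.items() if hours >= min_hours]
    # date.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
    has_weekend = any(d.weekday() >= 5 for d in valid_days)
    return len(valid_days) >= min_days and (has_weekend or not require_weekend)
```

Because every criterion is a parameter, re-filtering for a different publication or grant application becomes a matter of calling the function again with other values rather than rewriting the program.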

Data Cleaning “by hand”

As suggested in the quote from P7 above, several researchers interviewed have developed a system of cleaning data by hand and were satisfied with that procedure for their current research.

“I do it by hand, I'm a dinosaur like that. I remember having a graduate student who spent a long time with the computer science department [on automating process], and I [had my doubts about automating], because I really think it's a good idea to take a good look at your data... you can see, were they really wearing this every day [remembers an incident when grad students didn't personally examine data, included Non-Wear days, and it looked like patients were much less active than they really were] Maybe I'm old-fashioned, but I think it's important to have a really good feel for your data, looking at the graph and looking at the raw data, to do that.” P6

Participants recalled situations where either students or colleagues were less vigilant and caught Non-Wear or technical errors too late. They also may have had negative experiences working with cleaning raw data by hand or with early versions of sensor software. “[a sensor that collected raw data] was actually really irritating because you have HUGE files to look at and you would be trying to order them and count them and you would take the raw file and sort them in ascending value and then count all your from 0 to 100, or whatever it was for sedentary, and then count up the next cut-point.” P2

Several survey participants and interview participants have relied on consultants at other universities (e.g., Dan Heil, Kelli Cain) to help with data cleaning and analysis. However, one junior researcher noted that getting someone skilled can be costly and she would like to be able to do more analysis in-house:

“... I get mountains of data and it's a little too much for me and I have to outsource it. Most recently I uploaded all the data to the server at UCSD, Jim Salas' group and Kelli Cain was able to do the analysis for us and we paid her a consultant fee. But I would like to do everything in my own shop so I can see what's going on with the data, know how to do it myself, and learn what it can do for you.” P9

Batch Data

The lack or limitations of batch processing using standard sensor software was the first issue identified by P5 and was independently raised by two other interview respondents.

“[ActiLife] is not so useful, particularly for batches of data. If you have just one single person, it's okay to use it...[With ActiLife], you can pull out by person, but not by each day or variable, so it's not flexible. If I want these variables, and I want it by day, and I want it between these days, and then you get an output file. Now you get one subject with 3 files, because there are 3 output levels; that's not useful, you have to cut and paste data; that is something I don't like, because there could be something wrong … and it's hard to catch the error.” P5

Several researchers were conducting studies with only up to 30 units being released at a time, and reported being able to individually process data files. P6 reported being pleased with being able to do 5 or 6 per day. P2 is a physiotherapist and appreciated being able to examine data and relate it directly to a participant whom she had observed and with whom she was familiar. However, several researchers suggested that real-world studies will only become larger, and therefore software needs to handle batch analysis.

“Most of the people who are doing this now have medium sized studies, so whatever you can do to help with batch processing I think would be appreciated. Files are getting larger and we have a lot of participants [100].” P8

“We didn't figure out a way to throw a lot of subjects together. We are doing an analysis of a lot of subjects together to try to get more power and their [software] is designed to look at one subject at a time and spit out certain results for each subject. We really wanted to compile all the children in one set to get more power out of whatever equations we're trying to develop; the utility of working with a lot of subjects rather than one at a time.” P7
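The batch output that P5 and P7 describe, with all subjects and days in a single file rather than several files per subject, can be sketched as follows. The nested-dictionary input, the column names, and the function names are assumptions for illustration only and do not reflect the export format of ActiLife or any other sensor software.

```python
import csv
import io
from typing import Dict, List, TextIO


def write_batch_summary(records: Dict[str, Dict[str, List[int]]], out: TextIO) -> None:
    """Write one row per subject per day to a single CSV stream.

    `records` maps subject ID -> day label -> list of per-minute counts.
    """
    writer = csv.writer(out)
    writer.writerow(["subject", "day", "epochs", "mean_cpm"])
    for subject in sorted(records):
        for day in sorted(records[subject]):
            counts = records[subject][day]
            mean_cpm = round(sum(counts) / len(counts), 1)
            writer.writerow([subject, day, len(counts), mean_cpm])


def batch_summary_text(records: Dict[str, Dict[str, List[int]]]) -> str:
    """Convenience wrapper that returns the CSV as a string."""
    buffer = io.StringIO()
    write_batch_summary(records, buffer)
    return buffer.getvalue()
```

A single long-format table of this kind avoids the cut-and-paste step that P5 identified as a source of hard-to-catch errors, and it can be loaded directly into a statistics package for pooled analysis.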

Summary Data versus Raw Data

Although most of the researchers had experience collecting raw data, there was limited experience with raw data analysis. P10, who had the most experience with raw data, described the current state of the field.

“Even with the GTX3 ActiGraph, which is arguably the next step in [collecting raw data], people are still using it like [previous summary data sensors]. Only a handful are using the other axes, only a handful are doing machine learning, [Forrest?] model or Markoff, approaching activities of daily living, those classifications are for the very few... We are going to go out and buy the newest version of the sensors, and we're going to do everything we used to do, and collect at the highest resolution with the most channels on, that battery life will allow us, and do what we always did, publish against existing studies. This is a field new enough that the first thing we'll do is MVPA, the second thing we'll probably do is sedentary time, we may never get to analyzing data for higher level fruit.” P10

Several researchers did report current or recent studies in which raw data were collected with sensors such as the GT3X ActiGraph and reduced for summary data analysis, with the original data typically saved for later analysis.

“We were using [older sensor] and it had to be 1 minute [epochs], but now of course what we are doing is collecting raw data with the GT3X and we will re-integrate to 1 minute to compare to our old data, but we are also looking forward to being able to use pattern recognition like what's happening at the University of Massachusetts, to analyze the data cross-sectionally, using a newer techniques in terms of pattern recognition. Even though we really don't know how to analyze it, we figure there is someone out there smarter than us who is working on this. We knew right from the beginning that 1 minute epochs are too long, but that's all the memory we had at the time we started our study.” P8
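The re-integration step P8 mentions is simple when the data are already activity counts in short epochs, which is the assumption made in the sketch below (the function name `reintegrate` is hypothetical). Converting raw acceleration samples into counts is a separate, device-specific step that is not shown here.

```python
from typing import List, Sequence


def reintegrate(counts: Sequence[int], factor: int) -> List[int]:
    """Sum each group of `factor` consecutive short epochs into one longer epoch.

    For example, factor=6 turns 10-second epochs into 1-minute epochs.
    A trailing partial group is dropped rather than reported as a short epoch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    usable = len(counts) - len(counts) % factor
    return [sum(counts[i:i + factor]) for i in range(0, usable, factor)]
```

The operation only works in one direction: short epochs can be summed into longer ones for comparison with older data, but data stored in 1-minute epochs cannot be split back into shorter epochs.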

Several researchers (P1, P2, P5) described summary data analysis as being more manageable. They also assessed raw data collection and pattern analysis in terms of what would be required from a collection stand-point, concerned that participants would not be willing to wear sensors at non-hip/wrist locations, at multiple locations, or with particular orientations. This viewpoint was informed by previous experience with burdensome sensors such as the SensewearPro armband. P1 echoed the sentiments of survey respondents in wanting to know whether more involved raw data analysis would be significant from the perspective of health outcomes.

Participants reported learning summary data analysis techniques as part of their training or by reading literature in their field, but did not feel they had the skillset or abilities to transition to pattern analysis.

“If we want to use it for pattern analysis, we need tools, because none of us – we're physiotherapists – are mathematical... With raw data, how do you get meaningful data with pattern analysis when we're not [mathematically] skilled.” P5

However, most researchers did not express a strong motivation to begin moving toward raw data analysis at this time, being relatively satisfied with their current analysis techniques. The exceptions were researchers P3, P4, and P10, who agreed that a transition needs to be made to raw data collection and pattern analysis.

“just the fact that we continue to use cut-points is troublesome and certainly that's something that Fahd and Stephen are pioneers in, to get us out of cut-points and into activity identification methods, which is where I think we need to be headed...because I don't have a background in engineering, I don't want to be the one developing those [activity recognition] algorithms, but I would love to use them. I have reached the point where I am frustrated with the current cut-point methods and the limitations of that. I am interested in knowing what activities people are doing and not just the intensities and I also have my doubts about whether those intensities are correct; these cut-points the more and more we look at them, the more arbitrary these cut-points are becoming. I would love to know algorithms that could be used to identify activities and I would even be willing to change devices to allow me to do that, assuming those devices could be worn continuously and be used in intervention work. It would have to be done in a way where I could either partner with an engineer-type who would develop these algorithms and collaborate with that person or there was some kind of software set-up that would allow me to do it in an easy interface, a flexible interface.” P4

P3 often serves as a source of information to colleagues about techniques beyond cut-points and expressed frustration about collection decisions that limit re-analysis.

“and the data has to be collected at 10 second intervals [to apply the Crouter method]... a bunch of my military colleagues came up to me and I said, well you should use this method, but they had collected in 1 minute epochs and couldn't go back. [saving in ] raw data is really important because there is always going to be some other way of looking at it.” P3

Transparency of Algorithms

Researchers expressed frustration with proprietary algorithms. Researcher P7 related having to call the manufacturers of ActiHeart to get basic information about how a cut-point was developed, including the population and number of participants on which the categories were validated. Researcher P3 described how colleagues were unable to independently replicate energy expenditure algorithms, leading to a loss in confidence in those that were supplied. Researcher P4 had begun searching for adequate sleep actigraphy algorithms, but was finding very little that wasn't proprietary.

“Those algorithms would need to be open-source where I could inspect them myself. That's one of the big frustrations with all these devices, like Actigraph, they don't publish and it's just a black box. I would like to see what they are so we can evaluate them...I would like to see the math behind the algorithms, I would like to know what inputs are going in – so it is a single track accelerometer that's creating or is it 5 accelerometers that are contributing to that particular activity identification. I would like to know the methods used to determine these algorithms. Did they look at a variety of activities or did they focus on the sensitivity of identifying a single activity.... the standard ways of collecting validity about a new device, that sort of thing.” P4

While “transparency” resonated with several researchers, they were not always able to articulate what information they would need. Researcher P10, who, in addition to doing his own data collection, has developed software for PA data analysis for other researchers, offered his perspective on what the desire for “transparency” means for most health researchers.

“when we look at [these algorithms] as a group, we don't understand them very well, we don't even have the tools that we might need. 10% of us might have MatLab on our computers... we have very little skills there. You may tell me what it is, and that's great, that gives me comfort... what we're more concerned with is that... [the methods] are going to last, for our grant, for our work over time, backwards compatibility and forwards use. When we talk about wanting open source, not wanting a black box, I don't think we are particularly savvy enough to know what we want, you could probably tell us anything and we'd probably believe you, at least 90% of us” P10

In addition to this kind of transparency, there was a need to make sense of the data on a more intuitive level. For researcher P2, who was also using her data as a clinician working with obese children, it was important to be able to examine data files individually so that she could relate them to the individual. Potential errors in cut-point categorizations were easier to identify if she could consider her personal observations of the child's behavior and range of movement. Researchers in her group and those of the other interviewees have spent time video-taping their participants and trying to relate that to collected PA data. Researcher P9 expressed confusion about interpreting differences, or lack thereof, between individuals.

“I try to test them all [the sensors], I'll go run with them or something and we'll do a data dump and we'll see if it jumps, I look at all those little spikey things. I feel like sometimes they're really inconsistent within ActiLife, like [this one person] is athletic, wakes up every day and runs 15 miles, or whatever, and then I know I am more of a couch potato, and we put our graphs right next to each other, just the x and y axis, and the program puts little labels in there...calibration lines... and it just doesn't make sense, because he would be off the charts compared to my level of activity and it just didn't pass the straight face test, I didn't think. Why aren't his scores showing up higher? It didn't hold face validity for me.” P9

Like consumers of accelerometer devices, she wanted a sense of confidence that the sensors were detecting something meaningful.

“One time I brought a kit of sensors to my kid's first grade classroom and I said okay, shake them, jump all around, pass them to your neighbor, move them, move them, move them, and I took them back to the lab and downloaded them to see what was going on and … they were all over the map, so it was like hmmm..I lack confidence, even though we are supposed to have the gold standard, in my mind, is it that good.” P9

Desire for Specific Techniques

Researchers who were collecting raw data were anticipating doing pattern analysis in the future, but did not have specific plans on how to gain that expertise, other than consulting the literature and, for some, relying on experts and developers outside their group.

“We'll probably just stay up on the literature, we don't have the statistical modeling to develop our own algorithms; it's really not what I do, I am really an epidemiologist.... I pretty much like those receiver-operator approaches where they test those algorithms and give you some sense of how often it's correct and how often it's not, and the false positives and false negatives.” P

Activity identification was not independently raised as a next step research goal. Additionally, researchers were not asking participants to record activities through experience sampling or activity logs, other than sleep-wake times, sensor wear times, and for one researcher, electronic/screen times (TV, cell phone), further suggesting that activity identification is not a high priority for this set of researchers. Researcher P2 was concerned that activities that are most relevant for her population, obese children undergoing diet and activity interventions, would be difficult to detect (e.g., swimming).

However, P3 opined that an algorithm to classify basic postures of sitting, standing, and walking, would be widely and intensely appealing to health researchers and related that a cell-phone development company had recently presented her with that scenario.

More commonly, researchers expressed interest in analysis of bouts and sedentary breaks.

“... there is no software right now where you can look at breaks in sedentary behavior or breaks in activity behavior. Is it that you sit down a lot or stand up a lot, is it a break or something else. There is no, at the moment, software that is end-user friendly that can deal with it.” P5
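As a rough illustration of the bout and break analysis P5 is asking for, the sketch below counts sedentary bouts and the breaks that end them in a series of per-minute counts. The function name `sedentary_bouts_and_breaks`, the default cut-point of 100 counts per minute, and the definition of a break as any transition from a sedentary minute to a non-sedentary minute are all assumptions for this example; an appropriate cut-point would depend on the device and population.

```python
from typing import List, Sequence, Tuple


def sedentary_bouts_and_breaks(counts_per_minute: Sequence[int],
                               cut_point: int = 100) -> Tuple[List[int], int]:
    """Return (bout_lengths_in_minutes, number_of_breaks).

    A minute is sedentary when its count is below `cut_point`. A break is
    counted each time a sedentary bout is ended by a non-sedentary minute.
    Non-Wear time should be removed beforehand, since zeros from an unworn
    sensor would otherwise be counted as sedentary minutes.
    """
    bouts: List[int] = []
    breaks = 0
    run = 0
    for count in counts_per_minute:
        if count < cut_point:
            run += 1
        elif run:
            bouts.append(run)
            breaks += 1
            run = 0
    if run:
        # The record ended during a sedentary bout, so no break is counted.
        bouts.append(run)
    return bouts, breaks
```

Reporting the bout lengths as well as the break count lets a researcher apply a minimum bout duration afterwards without re-reading the data.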

Researchers P4 and P5 were transitioning into 24/7 data collection and were interested in detecting sleep/wake transitions as well as sleep actigraphy; sleep data was also of interest to P2.

Few of the researchers were at a point in their research when they could reference a specific new technique, particularly for raw data research, that they wanted to use. P10 specified entropy models. P3 had experience using the Crouter method and related how challenging it was to help other researchers try this as well.

“main question I get...'What do I do with all this data?' and my biggest frustration is when you tell someone, well, I would use the Crouter method, they don't – I don't think it's in the latest Actigraph software, there are a couple of regressions in there. So they have to go program it on their own and it's not particularly easy to do.” P3

Consistent with this remark, P9 recalled how technical members of the research team and colleagues have initially been willing to help try new techniques, but then were unable to actually assist with the required SAS syntax and macros.

Researchers were commonly using one sensing device at a time, along with discrete metrics such as lab-based glucose testing or lifestyle survey instruments. When concurrent sensors were in use, such as heart-rate monitors, synchronizing was not required or was relatively flexible, making sensor drift less of a concern. Analysis was performed on summary data from each device or researchers would use only one data stream at a time. An exception was P3, who had worked with Wockets developers on synchronization issues and was satisfied with the progress that was made.

Researchers varied in their data access patterns. P2, for example, strongly favored recently collected data, because she relied on her personal knowledge of the participants to interpret the data. P8 was conducting multi-year longitudinal research, however, and was linking new summary data to older summary data. P1 described running reanalysis for different publications using different Non-Wear criteria.

Real-time Monitoring

One survey respondent identified wireless upload and real-time monitoring when asked about techniques s/he would like to try:

“I want more cell-phone like devices, where data is automatically and wirelessly uploaded to an easy accessible location ... this is very important for longer-term research I'm doing into providing individualised feedback to participants.”

When asked about real-time data upload and monitoring, several researchers first identified the benefit for intervention studies (not necessarily their own), but quickly agreed that real-time compliance monitoring would be highly valuable (e.g., P1 commented “[that] would be fantastic!”). Some researchers have participants repeat days or the whole week of data collection if there was insufficient data; being able to catch such gaps as they occur would shorten this process. Researchers thought they would be able to monitor a web site daily when units were out, but envisioned receiving an email notifying them of an issue. One researcher is able to use the country-sponsored health system to send text message reminders to participants; other researchers would like to be able to do something similar with automatic reminders.

Sharing Data

Several researchers suggested that a substantial amount of the data being collected is underutilized due to a lack of data handling and analysis expertise on health research teams.

“I can't tell you how many offers I've had in the past year, 'do you want my accelerometer data, I have this data.'” P3

“People are expected to use accelerometers now, where it's not really their area, but it's the expectation that that's how you are going to collect your physical activity data, so a lot of people who don't have this expertise, they really could use more help than they are getting from the ActiGraph software and I assume from other software that goes with other monitors.” P8

There were no concerns about IRB approval. P2, a researcher from Ireland, referenced the “Safe Harbor Directive” for sharing data outside of the European Union; to her knowledge, American universities have been largely successful in getting approval under this directive, but companies sometimes run into roadblocks.

P5, a researcher from Sweden, emphasized that participants would have to have been informed that their data would be shared or would continue to be used, but most researchers felt that there would be little participant resistance if reasonable steps were taken to protect identity. Genetic information and GPS were seen as bigger red flags for IRBs, and possibly for participants. One researcher referenced the recent Fitbit scandal and wondered whether such incidents would begin to introduce more skepticism in participants. Overall, however, researchers did not express concern about privacy issues.

Two participants, however, were concerned about intellectual property issues. P5 recalled that the latest ActiGraph software asks the user about sharing data and she answered “no,” because she thought it implied that the data would be given without authorship entitlement. P8 also expressed concerns about sharing data.

“I've got to say, I don't see a lot of benefits to that kind of sharing [sharing data]. I'd share algorithms, anything that would make it easier for people to work with their own data, but it takes a lot of time and effort to collect this stuff and I'm not overly keen on just handing it off. On the other hand, if someone proposed doing some kind of pooled study and I would be a co-author and had some voice in it, I'd be happy to contribute. I'm good at giving our data to our students and members of our research team, but we haven't exhausted all the questions that we want to answer with our data.” P8

The same participant, however, had contributed data to the ICAD project, which performed a pooled analysis and, while it did not share raw data, did share outcome variables. She was pleasantly surprised at how easy it was to supply the data.

In spite of these concerns, there is a strong belief that there are substantial amounts of unpublished data that could be useful.

“The bigger the repository the better, the more diverse kinds of data the better... what can we learn in an adult or youth population and then apply to an older adult population. There may be tweaks, but the more data that we have to test... There's a treasure trove out there and I would be willing to contribute to that.” P4

There was also great interest in sharing Non-Wear algorithms, cut-points, and other analysis techniques. Researchers indicated that they could see themselves using otherwise unpublished techniques in grant applications and publications, if they could easily reference back to the documentation for the work posted on a publicly available site.

“I think people are open to new algorithms, if you are developing a way to integrate those algorithms and anyone could use them, I think that would be hugely helpful, especially because even in our center, if you take ActiGraph data, different groups are using different Non-Wear algorithms, different cut-points, different everything, and if there was a way to make that standard so you could compare across studies, that would be hugely useful.” P3

Researchers agreed that they would be interested in learning more about how other research groups accomplish data collection and analysis.

“Would I be comfortable getting suggestions from others? Oh yeah, if someone said to me, the state of the art is this...I try to look around and see what other people are doing, but I am sure I am missing some stuff.” P9

P10 envisioned an extension of the Wockets project offering YouTube how-to videos and information focused on developing standards across devices. P3 emphasized the importance of sustained partnerships between health researchers and engineers. P3 noted that she advises colleagues “to triple” whatever estimate they receive from engineers for implementing technological tools.

“For other researchers in our department, they are ready to fire their programmers because there are so many bugs and they are behind on their project. It can be a very difficult collaboration because of the different worlds [and different expectations on development timelines].” P3

She suggested that partnerships need at least 6 months to become productive and noted that the challenge with using students for development assistance is that they do not have the long-term time commitment to make real progress. P4 additionally noted that health researchers and engineers have to establish common ground about the criteria on which they will judge analysis techniques.

“We just speak very different languages. It just takes a long time for engineers and health intervention types to be able to communicate at the same level and also have the same goals. My goals are public health related, increasing physical activity and doing intervention work. I need to do measurement, do good measurement to do good studies and get results, but I feel like sometimes engineers, for better or worse, are much more focused on the minutiae and can get caught up in the details. That's a frustration. Sometimes that doesn't match the goals I have. Collecting data, getting algorithms, I'm okay with it being pretty good, I don't need it to be perfect. We can keep developing as we go, but I want initial validation that I can use in the field right away. But I'm very open to learning from engineers. Maybe just starters, the ways to break through the language barriers.” P4

P2 saw a greater need to coordinate research to better achieve a link with health outcomes.

“I think... maybe there is not enough sharing of information and data, it's a shame that – there is the WHO physical activity group, but I think if there could be more coordinated efforts so that actual targets could be met, one group looking at activity for heart health, one group looking at activity for digestion. I think that maybe a lot of studies are looking at activity because they should or they know it's important, but they are not really looking at how activity translates to health.” P2

Two researchers (P6 and P8) referenced overseeing student work and indicated that it was important for students to be able to access data and gain experience with visually inspecting data. Researcher P9 raised the challenge of commuting between campuses and working with outside consultants when data or analysis software may be accessible from only a limited number of machines.

Tools as Software or Service
Researcher P7 noted a preference for “familiar software,” but most researchers assumed that they would be changing software and going outside their research group when they eventually shift to include pattern analysis.

“At the moment, we can deal with different cut-points, when we start doing pattern recognition, we don't plan on developing software to do that, we hope that somebody will have it done. I guess the best bet is the University of Massachusetts and we plan on using their software, with this processing fee, that seems reasonable to me.” P8

Most participants had been doing accelerometry research for an extended period, including longitudinal studies (e.g., P8). As described above, most were satisfied with doing summary count analysis for the immediate future. However, researchers were quick to state that they would be willing to shift to new software, even to new devices, if data clean-up and analysis could be made more manageable.

“I don't know what you guys are thinking about charging, but given how long it takes to collect the darn data, paying some on the backend to get it processed strikes me as reasonable... I have done a little work on what it costs to get this data, and it costs a few hundred dollars a subject, so if you are paying another 20-30 dollars to run a file through some computer program, that's just not inappropriate at all.” P8

3.4. Design Recommendations

The primary objective of SPADES is to lower the barrier to entry for health researchers who are interested in collecting raw physiological data. At least three of the interviewed participants are at a point where they would like to pursue more complicated analysis methods that take advantage of richer data streams. The other researchers, although beginning to collect raw data as a matter of course, did not express a strong immediate need for raw data pattern analysis tools. There may be a number of reasons for this, including concern about the sensing device requirements and associated participant burden, belief that there is sufficient work yet to be done with summary data alone, and reluctance to consider analysis options before adequate tools for non-mathematicians are available.

However, it was also revealed that there are significant frustrations with existing tools for summary data preparation and analysis. Software that addresses these unmet needs may be sufficiently appealing to serve as an entry point for health researchers. Researchers may be attracted to the system by features such as batch processing and customizable Non-Wear filtering for summary data. Once they become familiar with the platform, they may appreciate opportunities to try “transparent” cut-points and Non-Wear algorithms supplied by other research groups. From there, researchers could experiment with techniques that are gaining traction, such as algorithms for sedentary break analysis. Ultimately, researchers could be encouraged to make feature requests, share algorithms, share case studies, and pool data for co-authored publications.

Although a variety of fields and experience levels were represented in this study, it is likely that the sampled survey respondents and interview participants are more comfortable with PA data collection and analysis than some members of the long-term target audience. Based on the comments of several participants, it may be assumed that there is a significant subset of researchers who are actively collecting physiological data, but who are not presently equipped to fully and meaningfully use these data. Providing a service where these researchers could seek out partnerships and begin experimenting with analysis methods, perhaps beginning with summary methods, may widen what they are able to do with what is becoming a required data source.

Based on the findings from the survey and interviews, the following features are recommended:

Support for summary data research – import data collected at one-minute epochs and provide a feature for reducing raw data to summary counts

Visual inspection of data – tools that encourage researchers (and in particular, students) to become familiar with the data they have collected; option to hide segments and columns of data without deleting; flags for errors, Non-Wear, or other automatically determined labels (see awake-sleep) that would still need to be confirmed by researcher

Customizable Non-Wear filtering – ability to do complex filtering such as only participants with at least 4 days of 10+ hours of wear-time, which include at least one weekend day; ability to easily compare how differing definitions of Non-Wear affect results

Batch processing – export variables of interest for all subjects in one file for easy analysis in SAS and SPSS

Sedentary breaks algorithms – frequently mentioned as a potentially promising analysis method that seems like a logical next step after traditional cut-point analysis

Real-time monitoring – email the researcher if likely technical errors or non-compliance are compromising data collection; ideally, send text reminders to participants; options for real-time feedback to participants for intervention researchers

Awake-sleep rhythms – as more researchers switch to wrist-based 24/7 data collection, algorithms to suggest likely sleep/wake rhythms (compared with self-report) could be helpful

Emerging interests: posture and sleep actigraphy – open source algorithms for sitting/standing/walking classification and sleep quality

Off-site storage – if, as predicted by two researchers, mid-size physical activity studies begin requiring 100+ participants, off-site storage becomes more critical; security standards, but also reasonable access for researchers who may be traveling or working with non-co-located teams

Sharing and partnerships – encourage the sharing of Non-Wear and other cleanup algorithms; make it easy to investigate different cut-points; make co-authorship expectations clear for shared data; case studies to share best practices and help overcome the language barrier between health researchers and engineers (for example, to give health researchers a feel for how long development of a feature request will likely take); tools that help student developers pick up where others left off to provide continuity; forums or tools to link investigators who are new to accelerometry research with more experienced researchers
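
To make the customizable Non-Wear filtering recommendation more concrete, the following is a minimal Python sketch of the example criterion given above (at least 4 days of 10+ hours of wear-time, including at least one weekend day). It is an illustration only, not a proposed SPADES implementation. The WearDay record, the valid_participants function, and the sample values are hypothetical, and the sketch assumes that daily wear hours have already been computed by some Non-Wear algorithm and that each participant-day appears once.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WearDay:
    """Wear time for one participant on one calendar day (hypothetical record)."""
    participant: str
    day: date
    wear_hours: float


def valid_participants(days, min_hours=10.0, min_days=4, min_weekend_days=1):
    """Return the sorted IDs of participants who meet the wear-time criteria."""
    valid_days = {}    # participant -> number of days with enough wear time
    weekend_days = {}  # participant -> number of those days on Sat/Sun
    for d in days:
        if d.wear_hours < min_hours:
            continue  # this day does not count toward either total
        valid_days[d.participant] = valid_days.get(d.participant, 0) + 1
        if d.day.weekday() >= 5:  # Monday is 0, so 5 and 6 are Sat and Sun
            weekend_days[d.participant] = weekend_days.get(d.participant, 0) + 1
    return sorted(
        p for p, n in valid_days.items()
        if n >= min_days and weekend_days.get(p, 0) >= min_weekend_days
    )


# Made-up sample data; 3 January 2011 was a Monday, 8 January a Saturday.
sample = [
    WearDay("A", date(2011, 1, 3), 11.0),
    WearDay("A", date(2011, 1, 4), 12.0),
    WearDay("A", date(2011, 1, 5), 10.0),
    WearDay("A", date(2011, 1, 8), 10.5),
    WearDay("B", date(2011, 1, 3), 12.0),
    WearDay("B", date(2011, 1, 4), 12.0),
    WearDay("B", date(2011, 1, 5), 12.0),
    WearDay("B", date(2011, 1, 6), 12.0),
    WearDay("B", date(2011, 1, 8), 6.0),   # weekend day with too little wear
    WearDay("C", date(2011, 1, 3), 11.0),
    WearDay("C", date(2011, 1, 4), 11.0),
    WearDay("C", date(2011, 1, 8), 11.0),
]

# Compare how differing definitions change which participants are retained.
print(valid_participants(sample))                      # default criterion
print(valid_participants(sample, min_weekend_days=0))  # no weekend requirement
print(valid_participants(sample, min_days=3))          # only 3 valid days needed
```

Exposing the thresholds as parameters is one way to support the second half of the recommendation: the same data can be rerun under several definitions and the retained participant lists compared side by side.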