BioSignals package

omartynchuk · November 24, 2020, 2:18pm

BioSignals is a package for the Datagrok platform. The goal of the project is to offer an efficient and automated biosignal processing routine. The initial version is based on pyphysio, a Python library developed by Andrea Bizzego.

The package reinforces the existing Python code with Datagrok’s visualization and data processing tools. The pipeline itself is designed with the scientific community in mind, standardizing and thus facilitating the usual ECG, EEG, EDA, etc. signal processing workflows. The fusion of manual and automated steps is largely enabled by our interactive viewers, scripting capabilities, detector functions, data augmentation, and a curated collection of scientific methods.

In particular, the project’s initial goals are:

- automatically read various biosensor file formats
  - integrate with the built-in file share browser
- provide efficient interactive visualizations for raw biosensor data
  - including domain-specific visualizations, such as “head view” for EEG
- provide efficient ways of manipulating raw biosensor data (marking regions, etc.)
- provide a collection of high-performance DSP algorithms
- detect the type of signals, along with the metadata (sampling rate, etc.)
- automatically suggest analyses and pipelines applicable to the current dataset
  - Example: “Extract step count” for the accelerometry data
- visually define pipelines
- derive high-level features from the raw biosensor signal
- allow building predictive models by integrating previously defined pipelines with Datagrok’s predictive modeling capabilities
  - Example: training a model to find “bad” quality segments based on the manually annotated data

Currently, the project is in its early stages and we welcome you to contribute your ideas to this thread.

skalkin · November 24, 2020, 8:45pm

I love this initiative - indeed, our platform has all the necessary pieces, but still, stitching it all together and providing a coherent user experience is no small task. I would suggest starting small: restrict the biosensors to a couple of the most commonly used and easily interpretable ones (perhaps ECG?), focus on the well-established and useful analyses and pipelines, build a “killer demo” case that demonstrates the possibilities of the framework, and grow from there.

Having the right data to work with is very important, I can’t stress this enough. We would need to find a publicly available collection of various biosensor files that we can host on our platform, anonymized so that it contains no personally identifiable information (this is very important for multiple reasons). Not only would these files serve as a sort of benchmark and a demo of our capabilities, but our file share browser would also serve as a natural starting point - a user would open a folder with data, click on a file, see the analysis suggestions and features that could possibly be derived from it, etc. Any ideas?

mgoodwinmedia · November 25, 2020, 6:22pm

I totally agree. Below are some links to rich data and software repositories from PhysioNet (https://physionet.org/about/) for us to consider.

vkovadlo · May 11, 2021, 4:24pm

Visual design of pipelines

We are introducing a capability to visually create biosensor processing pipelines in an interactive manner. Instead of tinkering with the Python script, at each step simply choose method(s) to apply along with the parameters, and the platform will do the rest. All downstream steps, including graphics, are recalculated on the fly. Currently, the pipelines adhere to the following structure, and there are multiple predefined transformations for each step:

Filtering and preprocessing - increase signal-to-noise ratio (e.g. low-pass filter)
Information extraction - extract a signal of a different type (e.g. get time intervals between consecutive heartbeats)
Calculation of physiological indicators - calculate indicators with desired segmentation parameters (e.g. Heart Rate)
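
To make the three-step structure concrete, here is a minimal sketch in plain Python. It is not the pyphysio or BioSignals implementation: the function names, the moving-average filter, the threshold-based peak detector, and the synthetic signal are all assumptions chosen only to illustrate how a filter feeds an extractor, which in turn feeds an indicator.

```python
import math

def moving_average(samples, window):
    """Step 1 (filtering): crude low-pass filter, a centered moving average."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed

def beat_intervals(samples, fs, threshold):
    """Step 2 (information extraction): seconds between consecutive peaks."""
    peaks = [
        i for i in range(1, len(samples) - 1)
        if samples[i] > threshold
        and samples[i] > samples[i - 1]
        and samples[i] >= samples[i + 1]
    ]
    return [(b - a) / fs for a, b in zip(peaks, peaks[1:])]

def mean_heart_rate(intervals):
    """Step 3 (indicator): mean heart rate in beats per minute."""
    return 60.0 / (sum(intervals) / len(intervals))

# Synthetic test signal (not real ECG): one narrow bump per second,
# sampled at 100 Hz, with a small 40 Hz ripple added as "noise".
fs = 100
raw = []
for n in range(10 * fs):
    bump = math.sin(2 * math.pi * n / fs) ** 21
    ripple = 0.05 * math.sin(2 * math.pi * 40 * n / fs)
    raw.append(bump + ripple)

filtered = moving_average(raw, window=5)
intervals = beat_intervals(filtered, fs, threshold=0.5)
bpm = mean_heart_rate(intervals)
print(round(bpm, 1))
```

Because the synthetic signal has one bump per second, the computed rate should come out at about 60 beats per minute. Changing the filter window or the threshold changes every downstream result, which is the behavior the visual pipeline recalculates on the fly.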

See it in action in Datagrok.

vkovadlo · May 11, 2021, 4:26pm

User-defined filters

In addition to the already existing DSP functions available for each step, you can write your own using any supported language (R, Python, Julia, JavaScript, Octave). Simply create a script as you would normally do, mark the function with one of the following tags: #filters, #extractors, or #indicators (see pyphysio function categories), and save it.

How to add your custom script:

Go to Functions | Scripts | Actions | New Script
Write your script and test it by running it on your files
Set the tag #filters, #estimators or #indicators
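
As an illustration of the steps above, a custom filter script might look roughly like the sketch below. The header uses Datagrok's script annotation comments; the script name, the input and output names, and the signature that BioSignals expects from a filter are assumptions made for this example, so check them against an existing script in the package before relying on them.

```python
#name: RemoveMean
#description: Hypothetical example filter that subtracts the mean from a signal column
#language: python
#tags: filters
#input: dataframe signal_table
#input: column values
#output: dataframe filtered

def remove_mean(samples):
    """Center the samples around zero by subtracting their mean."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]

# Assumption: on the platform, the inputs declared in the header exist as
# variables (`signal_table` as a pandas DataFrame, `values` as a column name).
# The guard keeps the file runnable on its own, where they are not defined.
if "signal_table" in globals():
    filtered = signal_table.copy()
    filtered[values] = remove_mean(signal_table[values].tolist())
```

Keeping the numeric logic in a small function such as `remove_mean` makes it easy to test on a plain list before running the script on real files.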

Now it is available in the corresponding app section:

vkovadlo · May 11, 2021, 4:28pm

Integration with PhysioNet

PhysioNet is an invaluable resource for doing research on complex physiologic signals. We’ve made it super easy to load data directly from PhysioNet, along with the annotations.

The following databases are made available:

This functionality is based on the WFDB (waveform-database) package, which enables you to download, read, convert, analyze, and share PhysioNet records.

See how it works:

See record 1 of ECG-ID Database in Datagrok.

Integration with the file browser

In addition to working with PhysioNet directly, we have also integrated BioSignals with Datagrok’s file browser. Simply click on the biosensor raw data file in the browser, and the platform will automatically provide a nice interactive preview for it:

File viewer of PhysioNet records in MIT format. See how it works:

See it in action on Datagrok

Click on the .atr file to get a preview. For it to work, there should be three files with the same name and the following extensions:

.atr - binary file with annotations (labels that generally refer to specific samples in associated signal files)
.dat - binary file with samples of digitized signal
.hea - short text file that describes the contents of associated signal files
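
The three-file requirement can be checked with a few lines of plain Python. The sketch below is not how the Datagrok file viewer is implemented; the function names are hypothetical. It lists the record names in a folder that have all three files, and it reads the record name, number of signals, sampling rate, and number of samples from the first line of a .hea header, assuming that line uses the simple form without optional extras such as segment counts or counter frequencies.

```python
import tempfile
from pathlib import Path

REQUIRED_EXTENSIONS = (".atr", ".dat", ".hea")

def complete_records(directory):
    """Return names of records that have all three files (.atr, .dat, .hea)."""
    found = {}
    for path in Path(directory).iterdir():
        if path.suffix in REQUIRED_EXTENSIONS:
            found.setdefault(path.stem, set()).add(path.suffix)
    return sorted(
        name for name, extensions in found.items()
        if len(extensions) == len(REQUIRED_EXTENSIONS)
    )

def parse_record_line(header_text):
    """Parse the first non-comment line of a .hea file (simple form only)."""
    for line in header_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            fields = line.split()
            info = {"record": fields[0], "n_signals": int(fields[1])}
            if len(fields) > 2:
                info["sampling_rate"] = float(fields[2])
            if len(fields) > 3:
                info["n_samples"] = int(fields[3])
            return info
    raise ValueError("no record line found in header")

# Usage with empty placeholder files: record 101 lacks its .atr file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("100.atr", "100.dat", "100.hea", "101.dat", "101.hea"):
        Path(folder, name).touch()
    previewable = complete_records(folder)

header = parse_record_line("100 2 360 650000\n100.dat 212 200 11 1024 995 -22131 0 MLII")
print(previewable, header)
```

For reading the actual signal and annotation data, the WFDB package mentioned above is the appropriate tool; this sketch only covers the file-naming convention and the first header line.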