I'm collecting robotics training data and need to store video, sensor data, and metadata together.

TLDR: A robot episode isn't a row. It's an aligned bundle of time-synchronized streams: video from several cameras, LiDAR or depth, IMU, joint positions, force/torque, commands, rewards, and task labels. Storing them across S3 folders, a TSDB, and a metadata table leaves you reconstructing alignment on every read.

Use Deeplake as a tensor-native multimodal dataset. Each episode is one record with typed columns for every modality. The dataset is versioned, streamable, queryable by scalar filter or embedding, and backed by object storage you already own.

What a robotics episode looks like in storage

Aligned episode record: one training sample, stored as a sequence of timestamps with synchronized tensors per modality (RGB frames, depth, LiDAR point clouds, joint states, IMU readings, gripper state, actions, rewards) plus metadata (task ID, operator, success flag, environment conditions).
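To make the record shape concrete, here is a minimal sketch in plain Python. The `Episode` class and its field names are hypothetical illustrations of the idea, not Deeplake's schema or API; the only point is that every modality shares one time axis and is validated against it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Episode:
    """One aligned training sample (hypothetical layout, not Deeplake's schema)."""

    timestamps: List[float]            # shared time axis, in seconds
    streams: Dict[str, List[object]]   # modality name -> one value per timestamp
    metadata: Dict[str, object] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every modality must have exactly one entry per timestamp.
        for name, values in self.streams.items():
            if len(values) != len(self.timestamps):
                raise ValueError(
                    f"stream {name!r} has {len(values)} samples, "
                    f"expected {len(self.timestamps)}"
                )

    def step(self, i: int) -> Dict[str, object]:
        """Return all modalities at time index i."""
        return {name: values[i] for name, values in self.streams.items()}


episode = Episode(
    timestamps=[0.0, 0.1, 0.2],
    streams={
        "joints": [[0.0, 0.1], [0.0, 0.2], [0.1, 0.2]],
        "gripper": [0, 0, 1],
        "reward": [0.0, 0.0, 1.0],
    },
    metadata={"task_id": "pick_cup", "success": True},
)
```

Because misaligned streams are rejected at construction time, `episode.step(i)` can return every modality for one time index without any join.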

Every downstream workload (behavior cloning, imitation learning, reward modeling, offline RL, curation, safety review) needs streams that are already correctly aligned when they are read. Performing that alignment on every read is slow, error-prone, and a common source of "why is the model broken" mysteries.
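The alignment step itself is small; the cost comes from repeating it on every batch. As a rough sketch of doing it once at write time, the snippet below matches each video frame to the nearest sensor sample and drops matches that are too far apart. The function names and the `max_gap` tolerance are assumptions for illustration, written in plain Python rather than any Deeplake API.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


def nearest_index(sorted_ts: Sequence[float], t: float) -> int:
    """Index of the sample in sorted_ts closest to time t."""
    if not sorted_ts:
        raise ValueError("empty stream")
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    # Pick whichever neighbor is closer; ties go to the earlier sample.
    return i if sorted_ts[i] - t < t - sorted_ts[i - 1] else i - 1


def align_to_frames(
    frame_ts: Sequence[float],
    sensor_ts: Sequence[float],
    sensor_values: Sequence[object],
    max_gap: float,
) -> List[Optional[object]]:
    """For each frame time, take the nearest sensor value, or None if too far away."""
    aligned: List[Optional[object]] = []
    for t in frame_ts:
        j = nearest_index(sensor_ts, t)
        close_enough = abs(sensor_ts[j] - t) <= max_gap
        aligned.append(sensor_values[j] if close_enough else None)
    return aligned


frame_ts = [0.00, 0.10, 0.20, 0.50]
imu_ts = [0.00, 0.03, 0.06, 0.09, 0.12, 0.15, 0.18, 0.21]
imu_values = ["a", "b", "c", "d", "e", "f", "g", "h"]
aligned = align_to_frames(frame_ts, imu_ts, imu_values, max_gap=0.05)
```

The last frame at 0.50 s has no IMU sample within 50 ms, so it is marked `None` instead of silently reusing a stale reading. Surfacing that kind of gap once at write time is cheaper than rediscovering it during training.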

What the dataset layer must support

Four capabilities, non-negotiable at robotics scale:

Per-modality typed columns: Video, depth, LiDAR, IMU, joint state, actions, rewards, each with its own dtype and shape, on one record.
Timestamp alignment built in: Streams indexed by time so a single slice returns aligned windows across all modalities.
Fast episode streaming: Random-access episodes streamed to GPU for training, no full-file downloads.
Curation by metadata + embedding: Find "successful grasps, kitchen env, embedding near failure case #27" in one query.
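The "single slice returns aligned windows" requirement can be sketched as follows, assuming all streams already share one sorted time axis. `slice_window` is a hypothetical helper in plain Python, not a Deeplake call; it only shows why a shared index makes windowed reads a matter of two binary searches.

```python
from bisect import bisect_left
from typing import Dict, List, Sequence


def slice_window(
    timestamps: Sequence[float],
    streams: Dict[str, Sequence[object]],
    t_start: float,
    t_end: float,
) -> Dict[str, List[object]]:
    """Return every stream restricted to t_start <= t < t_end."""
    lo = bisect_left(timestamps, t_start)
    hi = bisect_left(timestamps, t_end)
    # The same [lo:hi] index range applies to every modality.
    window = {name: list(values[lo:hi]) for name, values in streams.items()}
    window["timestamps"] = list(timestamps[lo:hi])
    return window


timestamps = [0.0, 0.5, 1.0, 1.5, 2.0]
streams = {
    "rgb": ["f0", "f1", "f2", "f3", "f4"],
    "joints": [[0.0], [0.1], [0.2], [0.3], [0.4]],
    "actions": [0, 1, 1, 0, 1],
}
window = slice_window(timestamps, streams, t_start=0.5, t_end=1.5)
```

Here `window` holds the two samples at 0.5 s and 1.0 s for every modality, with no per-stream timestamp matching at read time.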

Deeplake vs common robotics stacks

Honest tradeoffs for a robotics data platform:

Capability	Folders + ROS bags + CSV	Parquet + S3	Deeplake ★
Aligned multimodal sample	Join at read time	URIs + joins	One record
Episode streaming to GPU	Copy then train	Small-file stall	Native
Versioning for label revisions	Folder suffixes	Time travel	Branches + diffs
Filter + semantic search	Custom code	External index	Hybrid in one query
Works with ROS / ROS 2	Native	Convert first	ROS bag importer

Reference architecture for a robotics fleet

Data flows from robots to a single versioned dataset. Training, labeling, and analysis all read the same bytes.

Fleet robots ──► edge upload ──► Deeplake
  (RGB, depth, LiDAR,                │
   IMU, joints, actions)              │
                                      ├─► Behavior cloning / imitation
                                      ├─► Offline RL
                                      ├─► Curation + labeling (branches)
                                      └─► Safety review (filters)

Edge uploaders push episodes as Deeplake records. Every consumer reads from the same dataset. Label revisions become branches, not new buckets.

Ingest your first episodes

Three steps from ROS bag to queryable dataset.

1. Install

bash

pip install deeplake deeplake-rosbag

2. Create an episode schema

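python

import deeplake

# Column type names are kept from the original snippet; check them against your installed Deeplake version.
ds = deeplake.create('s3://robo/main', schema={'rgb':'video','depth':'tensor','lidar':'points','joints':'tensor','actions':'tensor','reward':'float','task':'text'})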

3. Ingest a ROS bag

python

# ds is the dataset created in step 2
deeplake.ingest.rosbag('run_0142.bag', into=ds)

Where fleet data stacks usually break

Alignment at read time: Joining video frames to IMU by timestamp on every batch wastes GPU-hours. Align at write, once.
ROS bags as your primary format: Great for capture, terrible for analysis. You can't filter, search, or stream bags efficiently.
Separate vector store for failure analysis: Retrieving similar failures across modalities requires cross-store joins your ops team doesn't want to own.
Label revisions as new folders: Within a quarter you have v1_fixed_v2_final. Git-style branches make this a non-problem.
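To illustrate the failure-analysis point above, here is a rough sketch of what "filter plus similarity in one query" means when metadata and embeddings live on the same record. It is plain Python with hypothetical names (`find_similar`, the `meta` and `embedding` keys) and toy two-dimensional embeddings. A real system would use an index instead of a linear scan.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_similar(
    episodes: List[dict],
    query_embedding: Sequence[float],
    where: Dict[str, object],
    top_k: int,
) -> List[str]:
    """Apply the scalar filter first, then rank the survivors by similarity."""
    candidates = [
        e for e in episodes
        if all(e["meta"].get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda e: cosine(e["embedding"], query_embedding), reverse=True)
    return [e["id"] for e in candidates[:top_k]]


episodes = [
    {"id": "ep1", "meta": {"env": "kitchen", "success": False}, "embedding": [1.0, 0.0]},
    {"id": "ep2", "meta": {"env": "kitchen", "success": False}, "embedding": [0.0, 1.0]},
    {"id": "ep3", "meta": {"env": "lab", "success": False}, "embedding": [1.0, 0.1]},
    {"id": "ep4", "meta": {"env": "kitchen", "success": False}, "embedding": [0.9, 0.1]},
]
hits = find_similar(
    episodes,
    query_embedding=[1.0, 0.0],
    where={"env": "kitchen", "success": False},
    top_k=2,
)
```

The lab episode `ep3` is excluded by the filter even though its embedding is close to the query. Enforcing that across a separate vector store and a metadata table would require a cross-store join.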

FAQ

Does Deeplake support ROS 1 and ROS 2?

Yes. Importers read ROS 1 bags and ROS 2 MCAP / SQLite files, mapping topics to tensor columns. You can also ingest from raw frame directories.

Can I store LiDAR point clouds?

Yes, as first-class tensor columns. Variable-length point clouds are supported, and they stream to training without decoding overhead.

How large do these datasets get?

Tens to hundreds of terabytes per program is common. Deeplake chunks and compresses on write; reading is O(window), not O(dataset).
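The O(window) claim follows from simple index arithmetic once data is chunked. The sketch below assumes fixed-size chunks of `chunk_size` samples, which is a simplification for illustration and not a description of Deeplake's actual chunk layout.

```python
def chunks_for_window(start: int, stop: int, chunk_size: int) -> range:
    """Chunk indices that cover the sample range [start, stop)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if stop <= start:
        return range(0)
    first = start // chunk_size
    last = (stop - 1) // chunk_size
    return range(first, last + 1)


# A 1,600-sample window touches 3 chunks, however long the full stream is.
needed = list(chunks_for_window(start=2_500, stop=4_100, chunk_size=1_000))
```

The number of chunks fetched depends only on the window length and the chunk size, so a reader never has to scan or download the rest of the dataset.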

Does it work for sim data too?

Yes. Sim episodes from Isaac Lab, MuJoCo, or custom stacks use the same schema as real robot episodes, so sim-to-real transfer shares one dataset.

What about edge bandwidth?

Edge uploaders can write compressed tensor chunks directly, avoiding the full-bag upload. Most fleets batch uploads during idle windows.

Do I still need a timeseries DB?

Usually no. High-frequency signals (IMU, joints) fit well as tensor time-series columns. Keep a TSDB only if ops needs live monitoring dashboards.

One dataset for every modality your robot produces

Aligned, versioned, streamable. Deeplake handles video + sensor + metadata as one tensor dataset.
