Dpdata Toolkit

ai2-kit tool dpdata

This toolkit is a command-line wrapper around dpdata that lets users process DeepMD datasets from the command line.

Usage

This toolkit includes the following commands:

| Command | Description | Example | Reference |
| --- | --- | --- | --- |
| `read` | Read a dataset into memory. This command does nothing useful on its own; chain other commands after it. | `ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy` | Supports wildcards; can be called multiple times. |
| `write` | Merge the datasets with MultiSystems and write the result to a directory. | `ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - write ./path/to/merged_dataset` | |
| `filter` | Filter the dataset with a lambda expression evaluated on system data. | `ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - filter "lambda x: x['forces'].max() < 10" - write ./path/to/filtered_dataset` | |
| `set_fparam` | Add fparam to the dataset; the value can be a float or a list of floats. | `ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - set_fparam "[0,1]" - write ./path/to/filtered_dataset` | |

These commands are chainable: separate them with - to process a dataset as a pipeline. The examples in the next section show typical pipelines.
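
The chaining idea can be pictured with a minimal Python sketch. It makes no claim about ai2-kit's internals: `ToyPipeline`, `run_chain`, and the in-memory records are hypothetical stand-ins that only illustrate how a `-` separator can split one command line into successive method calls on the same object.

```python
from typing import Any, Callable, Dict, List


class ToyPipeline:
    """Hypothetical stand-in for a chainable toolkit; not ai2-kit's real class."""

    def __init__(self) -> None:
        self.items: List[Dict[str, Any]] = []
        self.written: Dict[str, List[Dict[str, Any]]] = {}

    def read(self, *names: str) -> "ToyPipeline":
        # Pretend each name is a dataset that yields one record.
        for name in names:
            self.items.append({"name": name, "max_force": float(len(name))})
        return self  # returning self lets the next command operate on the same state

    def filter(self, expr: str) -> "ToyPipeline":
        # The predicate arrives as a string, as on the command line.
        # eval() is acceptable for a toy; never use it on untrusted input.
        keep: Callable[[Dict[str, Any]], bool] = eval(expr)
        self.items = [item for item in self.items if keep(item)]
        return self

    def write(self, target: str) -> "ToyPipeline":
        self.written[target] = list(self.items)
        return self


def run_chain(pipeline: ToyPipeline, argv: List[str]) -> ToyPipeline:
    """Split argv on the '-' separator and call each command in order."""
    command: List[str] = []
    for token in argv + ["-"]:
        if token == "-":
            if command:
                getattr(pipeline, command[0])(*command[1:])
            command = []
        else:
            command.append(token)
    return pipeline


result = run_chain(
    ToyPipeline(),
    ["read", "a", "bbbb", "-", "filter", "lambda x: x['max_force'] < 3", "-", "write", "out"],
)
print([item["name"] for item in result.written["out"]])
```

Each command mutates shared state and returns the pipeline object, so the order of commands on the command line is the order in which the data is transformed.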

Example

# Read multiple datasets generated by the training workflow using a wildcard, then merge them into a single dataset.
# You can also call `read` multiple times to read datasets from different directories.
ai2-kit tool dpdata read ./workdir/iters-*/train-deepmd/new_dataset/* --fmt deepmd/npy - write ./merged_dataset --fmt deepmd/npy

# You can also save the data in HDF5 format
ai2-kit tool dpdata read ./workdir/iters-*/train-deepmd/new_dataset/* --fmt deepmd/npy - write ./merged.hdf5 --fmt deepmd/hdf5


# Use a lambda expression to filter out outlier data
ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - filter "lambda x: x['forces'].max() < 10" - write ./path/to/filtered_dataset
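
To see what such a predicate evaluates, here is a small NumPy sketch. It assumes `x` behaves like a mapping whose `'forces'` entry is an array of shape (nframes, natoms, 3), which is how dpdata stores forces; the `system` dict below is a hypothetical stand-in, not a real dpdata object. Note that `.max()` only looks at the largest signed component, so a large negative force component still passes; taking the absolute value first checks magnitudes instead.

```python
import numpy as np

# Hypothetical stand-in for one system: 2 frames, 2 atoms, 3 force components.
system = {
    "forces": np.array([
        [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]],
        [[3.0, -50.0, 1.0], [0.2, 0.1, 0.0]],  # one component is -50
    ])
}

# Same form as the CLI example: compares the largest signed component.
signed_check = lambda x: x["forces"].max() < 10

# Variant that compares the largest magnitude, so negative outliers are caught.
magnitude_check = lambda x: np.abs(x["forces"]).max() < 10

print(signed_check(system))     # largest signed component is 3.0
print(magnitude_check(system))  # largest magnitude is 50.0
```

If the goal is to reject any system with a force component larger than 10 in magnitude, the second form is likely the one to pass to `filter`.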

# Add fparam to the dataset (the list is quoted so the shell does not treat [0,1] as a glob pattern)
ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - set_fparam "[0,1]" - write ./path/to/filtered_dataset
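
In the DeePMD-kit data format, frame parameters are stored per frame, as an array of shape (nframes, numb_fparam). The sketch below shows one plausible way a single float or a list of floats could be expanded to that shape. `expand_fparam` is a hypothetical helper written for illustration; the toolkit's actual implementation may differ.

```python
from typing import Sequence, Union

import numpy as np


def expand_fparam(fparam: Union[float, Sequence[float]], nframes: int) -> np.ndarray:
    """Repeat one fparam value (or vector) for every frame.

    Hypothetical helper: illustrates the (nframes, numb_fparam) layout only.
    """
    # A bare float becomes a length-1 vector; a list keeps its length.
    values = np.atleast_1d(np.asarray(fparam, dtype=float))
    # Stack the same vector once per frame.
    return np.tile(values, (nframes, 1))


print(expand_fparam([0, 1], nframes=3).shape)
print(expand_fparam(0.5, nframes=2).tolist())
```

With this layout, `set_fparam "[0,1]"` would correspond to two frame parameters per frame, and a bare float to one.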