Dpdata Toolkit
ai2-kit tool dpdata
This toolkit is a command-line wrapper around dpdata that lets users process DeepMD datasets from the command line.
Usage
This toolkit includes the following commands:
| Command | Description | Example | Reference |
|---|---|---|---|
| read | Read datasets into memory. This command does nothing useful on its own; chain other commands after it. | | Supports wildcards; can be called multiple times. |
| write | Use MultiSystems to merge the datasets and write them to a directory. | | |
| filter | Use a lambda expression to filter the dataset by system data. | | |
| set_fparam | Add fparam to the dataset. | | |
These commands are chainable and can be used to process datasets in a pipeline fashion (separated by `-`). For more information, see the following examples.
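To make the chaining idea concrete, here is a minimal Python sketch of how a flat argument list can be split on a standalone `-` and dispatched to chainable methods. It is only an illustration of the pattern, not ai2-kit's implementation: `DatasetPipeline`, `split_on_separator`, and `run` are hypothetical names, and the methods record their calls instead of touching any data.

```python
from typing import Dict, List, Tuple


class DatasetPipeline:
    """Hypothetical stand-in for the toolkit object; it only records calls."""

    def __init__(self) -> None:
        self.log: List[Tuple] = []

    def read(self, *paths: str, fmt: str = "deepmd/npy") -> "DatasetPipeline":
        # The shell expands wildcards, so several paths may arrive at once.
        self.log.append(("read", paths, fmt))
        return self  # returning self is what makes the commands chainable

    def filter(self, expr: str) -> "DatasetPipeline":
        self.log.append(("filter", expr))
        return self

    def write(self, path: str, fmt: str = "deepmd/npy") -> "DatasetPipeline":
        self.log.append(("write", path, fmt))
        return self


def split_on_separator(argv: List[str], sep: str = "-") -> List[List[str]]:
    """Split a flat argument list into one group per chained command."""
    groups: List[List[str]] = [[]]
    for token in argv:
        if token == sep:
            groups.append([])
        else:
            groups[-1].append(token)
    return [group for group in groups if group]


def run(argv: List[str]) -> DatasetPipeline:
    """Dispatch each group to the method named by its first token."""
    pipeline = DatasetPipeline()
    for name, *args in split_on_separator(argv):
        positional: List[str] = []
        options: Dict[str, str] = {}
        tokens = iter(args)
        for token in tokens:
            if token.startswith("--"):
                options[token[2:]] = next(tokens)  # "--fmt value" pairs
            else:
                positional.append(token)
        getattr(pipeline, name)(*positional, **options)
    return pipeline
```

Calling `run(["read", "a", "b", "--fmt", "deepmd/npy", "-", "write", "out"])` records one `read` call with two paths followed by one `write` call, mirroring how each command in the chain operates on the state left by the previous one.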
Example
# Read multiple datasets generated by the training workflow using a wildcard and merge them into a single dataset
# You can also call `read` multiple times to read datasets from different directories
ai2-kit tool dpdata read ./workdir/iters-*/train-deepmd/new_dataset/* --fmt deepmd/npy - write ./merged_dataset --fmt deepmd/npy
# You can also save the data in HDF5 format
ai2-kit tool dpdata read ./workdir/iters-*/train-deepmd/new_dataset/* --fmt deepmd/npy - write ./merged.hdf5 --fmt deepmd/hdf5
# Use a lambda expression to filter out outlier data
ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - filter "lambda x: x['forces'].max() < 10" - write ./path/to/filtered_dataset
# Add fparam to the dataset
ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - set_fparam [0,1] - write ./path/to/filtered_dataset
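The `filter` and `set_fparam` examples above can be pictured with the following NumPy sketch. It rests on assumptions that the source does not confirm: that the lambda string is evaluated into a callable and applied to each system's data dictionary, and that `set_fparam` repeats the given values for every frame. The helper names `filter_systems` and `set_fparam`, and the toy dictionaries, are hypothetical and stand in for real dpdata systems.

```python
import numpy as np


def filter_systems(systems, lambda_expr):
    """Keep the systems for which the lambda expression returns True.

    Assumption: the expression string is evaluated into a callable that
    receives one system's data dictionary.
    """
    predicate = eval(lambda_expr)
    return [data for data in systems if predicate(data)]


def set_fparam(data, fparam):
    """Attach the same frame parameters to every frame of one system.

    Assumption: fparam is stored with shape (nframes, len(fparam)).
    """
    nframes = data["forces"].shape[0]
    data["fparam"] = np.tile(np.asarray(fparam, dtype=float), (nframes, 1))
    return data


# Toy systems: forces have shape (nframes, natoms, 3).
calm = {"forces": np.array([[[0.5, -0.2, 0.1], [-0.5, 0.2, -0.1]]])}
noisy = {"forces": np.array([[[0.5, -25.0, 0.1], [-0.5, 25.0, -0.1]]])}
# The only large component here is negative, so a signed max() misses it.
negative_only = {"forces": np.array([[[0.5, -25.0, 0.1], [-0.5, 3.0, -0.1]]])}

systems = [calm, noisy, negative_only]
kept_signed = filter_systems(systems, "lambda x: x['forces'].max() < 10")
kept_abs = filter_systems(systems, "lambda x: abs(x['forces']).max() < 10")
```

In this sketch the signed expression keeps `negative_only` as well as `calm`, because `.max()` only looks at the largest positive component. Wrapping the array in the built-in `abs(...)` keeps only `calm`, which may be closer to the intent when filtering outliers by force magnitude.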