
usage: train [-h] [--describe] [-d DEVICE] [--num-workers NUM_WORKERS]
             [-j NUM_THREADS] [--train-images TRAIN_IMAGES]
             [--train-targets TRAIN_TARGETS] [--test-images TEST_IMAGES]
             [--test-targets TEST_TARGETS]
             [--format {auto,coord,csv,star,box}] [--image-ext IMAGE_EXT]
             [-k K_FOLD] [--fold FOLD]
             [--cross-validation-seed CROSS_VALIDATION_SEED]
             [-n NUM_PARTICLES] [--pi PI] [-r RADIUS]
             [--method {PN,GE-KL,GE-binomial,PU}] [--slack SLACK]
             [--autoencoder AUTOENCODER] [--l2 L2]
             [--learning-rate LEARNING_RATE] [--natural]
             [--minibatch-size MINIBATCH_SIZE]
             [--minibatch-balance MINIBATCH_BALANCE] [--epoch-size EPOCH_SIZE]
             [--num-epochs NUM_EPOCHS] [--pretrained] [--no-pretrained]
             [-m MODEL] [--units UNITS] [--dropout DROPOUT] [--bn {on,off}]
             [--pooling POOLING] [--unit-scaling UNIT_SCALING] [--ngf NGF]
             [--save-prefix SAVE_PREFIX] [-o OUTPUT]
             [--test-batch-size TEST_BATCH_SIZE]

Named Arguments


--describe

only prints a description of the model, does not train

Default: False

-d, --device

which device to use, set to -1 to force CPU (default: 0)

Default: 0


--num-workers

number of worker processes for data augmentation. if set to <0, automatically uses all available CPUs (default: 0)

Default: 0

-j, --num-threads

number of threads for pytorch, 0 uses pytorch defaults, <0 uses all cores (default: 0)

Default: 0
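
The worker and thread conventions described above can be sketched as follows. This is a minimal illustration and not topaz's implementation: `resolve_num_workers` and `resolve_num_threads` are hypothetical helpers, and returning `None` to mean "leave the pytorch default untouched" is an assumption.

```python
import os

def resolve_num_workers(num_workers):
    """Negative values mean 'use every available CPU'; other values pass through."""
    if num_workers < 0:
        return os.cpu_count() or 1
    return num_workers

def resolve_num_threads(num_threads):
    """0 keeps the library default (None here); negative values mean all cores."""
    if num_threads == 0:
        return None
    if num_threads < 0:
        return os.cpu_count() or 1
    return num_threads

print(resolve_num_workers(-1), resolve_num_threads(4))
```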

training data arguments (required)


--train-images

path to file listing the training images. also accepts a directory path from which all images are loaded.


--train-targets

path to file listing the training particle coordinates

test data arguments (optional)


--test-images

path to file listing the test images. also accepts a directory path from which all images are loaded.


--test-targets

path to file listing the test particle coordinates

data format arguments (optional)


--format

Possible choices: auto, coord, csv, star, box

file format of the particle coordinates file (default: detect format automatically based on file extension)

Default: “auto”
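
Extension-based detection for the "auto" setting could look roughly like the sketch below. The extension table and the fallback to "coord" for unrecognized extensions are assumptions made for illustration; `detect_format` is a hypothetical helper, not a documented topaz function.

```python
import os

# Assumed mapping from file extension to format name; the real table may differ.
EXTENSION_FORMATS = {".star": "star", ".box": "box", ".csv": "csv"}

def detect_format(path, fmt="auto"):
    """Return the explicit format if given, else guess from the file extension."""
    if fmt != "auto":
        return fmt
    ext = os.path.splitext(path)[1].lower()
    # Assumption: anything unrecognized is treated as a plain coordinate table.
    return EXTENSION_FORMATS.get(ext, "coord")

print(detect_format("particles.star"), detect_format("particles.txt"))
```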


--image-ext

sets the image extension when loading images from a directory. should include “.” before the extension (e.g. .tiff). (default: find all extensions)

Default: “”
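
As a rough illustration of how an extension filter with an empty default behaves, the sketch below lists files in a directory and keeps every file when the extension is "". `list_images` and the file names are hypothetical; the way topaz actually scans directories is not shown in this reference.

```python
import glob
import os
import tempfile

def list_images(directory, image_ext=""):
    """List files in `directory`; an empty extension matches every file."""
    pattern = os.path.join(directory, "*" + image_ext)
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))

# Small demonstration on a temporary directory with two placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.tiff", "b.mrc"):
        open(os.path.join(tmp, name), "w").close()
    only_tiff = [os.path.basename(p) for p in list_images(tmp, ".tiff")]
    everything = [os.path.basename(p) for p in list_images(tmp)]

print(only_tiff, everything)
```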

cross validation arguments (optional)

-k, --k-fold

option to split the training set into K folds for cross validation (default: not used)

Default: 0


--fold

when using K-fold cross validation, sets which fold is used as the held-out test set (default: 0)

Default: 0


--cross-validation-seed

random seed for partitioning data into folds (default: 42)

Default: 42
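
The interaction between the fold count, the held-out fold index, and the seed can be sketched as below. This is a generic seeded K-fold split written for illustration; `split_folds` and the micrograph names are hypothetical, and topaz's own partitioning may shuffle or group the data differently.

```python
import random

def split_folds(items, k, fold, seed=42):
    """Shuffle with a fixed seed, deal items into k folds, hold out one fold."""
    if not 0 <= fold < k:
        raise ValueError("fold must be in [0, k)")
    order = list(items)
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    test = folds[fold]
    train = [x for i, f in enumerate(folds) if i != fold for x in f]
    return train, test

micrographs = [f"mic_{i:03d}" for i in range(10)]  # hypothetical image names
train, test = split_folds(micrographs, k=5, fold=0)
print(len(train), len(test))
```

Because the shuffle is seeded, repeating the call with the same seed and a different fold index yields complementary held-out sets drawn from the same partition.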

training arguments (required)

-n, --num-particles

instead of setting pi directly, pi can be set by giving the expected number of particles per micrograph (>0). either this parameter or pi must be set.

Default: -1


--pi

parameter specifying the fraction of data that is expected to be positive
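
One way to picture how an expected particle count could translate into pi is sketched below. It assumes that pi is the fraction of pixels labeled positive, that every pixel within the radius of a particle center (boundary included) counts as positive, and that neighboring particles do not overlap. The helper names, the particle count of 300, and the 512 x 512 image size are all hypothetical.

```python
def positive_pixels(radius):
    """Count integer pixel offsets lying within `radius` of a particle center."""
    r = int(radius)
    return sum(
        1
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if dx * dx + dy * dy <= radius * radius
    )

def estimate_pi(num_particles, radius, height, width):
    """Expected positive pixels per micrograph divided by total pixels."""
    return num_particles * positive_pixels(radius) / (height * width)

pi = estimate_pi(num_particles=300, radius=3, height=512, width=512)
print(positive_pixels(3), round(pi, 4))
```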

training arguments (optional)

-r, --radius

pixel radius around particle centers to consider positive (default: 3)

Default: 3


--method

Possible choices: PN, GE-KL, GE-binomial, PU

objective function to use for learning the region classifier (default: GE-binomial)

Default: “GE-binomial”


--slack

weight on GE penalty (default: 10 for GE-KL, 1 for GE-binomial)

Default: -1
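
The listed default of -1 reads as a sentinel that is replaced by a method-specific weight. A minimal sketch of that resolution follows; `resolve_slack` is a hypothetical helper, and returning `None` for PN and PU (where no GE penalty applies) is an assumption.

```python
# Method-specific defaults as stated in the help text.
DEFAULT_SLACK = {"GE-KL": 10.0, "GE-binomial": 1.0}

def resolve_slack(method, slack=-1):
    """Keep an explicit non-negative slack, else fall back to the method default."""
    if slack >= 0:
        return slack
    # Assumption: PN and PU do not use the GE penalty, so there is no default.
    return DEFAULT_SLACK.get(method)

print(resolve_slack("GE-KL"), resolve_slack("GE-binomial"), resolve_slack("GE-KL", 2.5))
```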


--autoencoder

option to augment the method with an autoencoder. sets the weight on the reconstruction error (default: 0)

Default: 0


--l2

l2 regularizer on the model parameters (default: 0)

Default: 0.0


--learning-rate

learning rate for the optimizer (default: 0.0002)

Default: 0.0002


--natural

sample unbiasedly from the data to form minibatches, rather than sampling particles and non-particles at the ratio given by the minibatch-balance parameter

Default: False


--minibatch-size

number of data points per minibatch (default: 256)

Default: 256


--minibatch-balance

fraction of minibatch that is positive data points (default: 0.0625)

Default: 0.0625
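
With the defaults, a balanced minibatch works out to 16 positive and 240 remaining data points. The sketch below shows the arithmetic; `minibatch_counts` is a hypothetical helper and rounding to the nearest integer is an assumption about how fractional counts would be handled.

```python
def minibatch_counts(minibatch_size=256, minibatch_balance=0.0625):
    """Split a minibatch into (positive, other) counts for a target balance."""
    num_positive = int(round(minibatch_size * minibatch_balance))
    return num_positive, minibatch_size - num_positive

num_positive, num_other = minibatch_counts()
print(num_positive, num_other)
```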


--epoch-size

number of parameter updates per epoch (default: 1000)

Default: 1000


--num-epochs

maximum number of training epochs (default: 10)

Default: 10
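
Since an epoch is defined here by a number of parameter updates rather than by a pass over the data, the training budget follows directly from three settings. A small sketch of that arithmetic, with `training_budget` as a hypothetical helper:

```python
def training_budget(epoch_size=1000, num_epochs=10, minibatch_size=256):
    """Return (total parameter updates, total data points sampled) at most."""
    updates = epoch_size * num_epochs
    return updates, updates * minibatch_size

updates, samples = training_budget()
print(updates, samples)
```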

model arguments (optional)


--pretrained

by default, topaz train will initialize model parameters from the pretrained parameters if a pretrained model with the same configuration is available (e.g. resnet8 with 64 units). disable this behaviour by setting the --no-pretrained flag

Default: True


--no-pretrained

Default: True
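
The two "Default: True" entries above are what one would expect if the enable and disable flags write to a single shared destination. The sketch below shows that common argparse pattern; it is an assumption about how such a pair is typically declared, not a copy of topaz's parser.

```python
import argparse

parser = argparse.ArgumentParser(prog="train")
# Both flags store into the same attribute; only the stored value differs.
parser.add_argument("--pretrained", dest="pretrained", action="store_true")
parser.add_argument("--no-pretrained", dest="pretrained", action="store_false")
parser.set_defaults(pretrained=True)

default_args = parser.parse_args([])
disabled_args = parser.parse_args(["--no-pretrained"])
print(default_args.pretrained, disabled_args.pretrained)
```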

-m, --model

model type to fit (default: resnet8)

Default: “resnet8”


--units

number of units model parameter (default: 32)

Default: 32


--dropout

dropout rate model parameter (default: 0.0)

Default: 0.0


--bn

Possible choices: on, off

use batch norm in the model (default: on)

Default: “on”


--pooling

pooling method to use (default: none)


--unit-scaling

scale the number of units up by this factor every pool/stride layer (default: 2)

Default: 2
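
To make the scaling rule concrete, the sketch below lists the unit count after each pool/stride layer. `units_per_stage` and the choice of three stages are hypothetical; the number of pool/stride layers depends on the model type.

```python
def units_per_stage(units=32, unit_scaling=2, num_stages=3):
    """Unit count at each stage when it is multiplied by unit_scaling per stage."""
    return [units * unit_scaling ** i for i in range(num_stages)]

stages = units_per_stage()
print(stages)
```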


--ngf

scaled number of units per layer in generative model, only used if autoencoder > 0 (default: 32)

Default: 32

output file arguments (optional)


--save-prefix

path prefix to save trained models each epoch

-o, --output

destination to write the train/test curve

miscellaneous arguments (optional)


--test-batch-size

batch size for calculating test set statistics (default: 1)

Default: 1
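
Putting the required and commonly used options together, a command line might be assembled as in the sketch below. Every flag comes from the usage summary at the top; all file paths and the particle count of 300 are hypothetical placeholders. The snippet only builds and prints the command string and does not run it.

```python
import shlex

command = [
    "topaz", "train",
    "--train-images", "data/train_images.txt",    # hypothetical path
    "--train-targets", "data/train_targets.txt",  # hypothetical path
    "--test-images", "data/test_images.txt",      # hypothetical path
    "--test-targets", "data/test_targets.txt",    # hypothetical path
    "-n", "300",                                  # expected particles per micrograph
    "-r", "3",
    "--method", "GE-binomial",
    "--save-prefix", "saved_models/model",        # hypothetical path
    "-o", "saved_models/training_curve.txt",      # hypothetical path
]

command_line = shlex.join(command)
print(command_line)
```

Either -n or --pi has to be supplied; the test image and target options can be dropped if no held-out set is available or if -k is used instead.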