Train
usage: train [-h] [--describe] [-d DEVICE] [--num-workers NUM_WORKERS]
[-j NUM_THREADS] [--train-images TRAIN_IMAGES]
[--train-targets TRAIN_TARGETS] [--test-images TEST_IMAGES]
[--test-targets TEST_TARGETS]
[--format {auto,coord,csv,star,box}] [--image-ext IMAGE_EXT]
[-k K_FOLD] [--fold FOLD]
[--cross-validation-seed CROSS_VALIDATION_SEED]
[-n NUM_PARTICLES] [--pi PI] [-r RADIUS]
[--method {PN,GE-KL,GE-binomial,PU}] [--slack SLACK]
[--autoencoder AUTOENCODER] [--l2 L2]
[--learning-rate LEARNING_RATE] [--natural]
[--minibatch-size MINIBATCH_SIZE]
[--minibatch-balance MINIBATCH_BALANCE] [--epoch-size EPOCH_SIZE]
[--num-epochs NUM_EPOCHS] [--pretrained] [--no-pretrained]
[-m MODEL] [--units UNITS] [--dropout DROPOUT] [--bn {on,off}]
[--pooling POOLING] [--unit-scaling UNIT_SCALING] [--ngf NGF]
[--save-prefix SAVE_PREFIX] [-o OUTPUT]
[--test-batch-size TEST_BATCH_SIZE]
Named Arguments
- --describe
print a description of the model and exit without training
Default: False
- -d, --device
which device to use, set to -1 to force CPU (default: 0)
Default: 0
- --num-workers
number of worker processes for data augmentation, if set to <0, automatically uses all CPUs available (default: 0)
Default: 0
- -j, --num-threads
number of threads for pytorch, 0 uses pytorch defaults, <0 uses all cores (default: 0)
Default: 0
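The --num-threads convention above (0 keeps the library default, a negative value uses every core, a positive value is taken as given) can be sketched as follows. This is only an illustration: `resolve_num_threads` is a hypothetical helper, not a function from topaz or pytorch, and `library_default` stands in for whatever pytorch would pick on its own.

```python
import os


def resolve_num_threads(requested: int, library_default: int) -> int:
    """Hypothetical helper mirroring the documented --num-threads rules."""
    if requested < 0:
        # <0: use all cores; os.cpu_count() may return None, so fall back to 1
        return os.cpu_count() or 1
    if requested == 0:
        # 0: keep whatever the library would choose by default
        return library_default
    # >0: use the requested number of threads as given
    return requested


if __name__ == "__main__":
    print(resolve_num_threads(0, library_default=4))
    print(resolve_num_threads(-1, library_default=4))
    print(resolve_num_threads(8, library_default=4))
```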
training data arguments (required)
- --train-images
path to file listing the training images. also accepts directory path from which all images are loaded.
- --train-targets
path to file listing the training particle coordinates
test data arguments (optional)
- --test-images
path to file listing the test images. also accepts directory path from which all images are loaded.
- --test-targets
path to file listing the testing particle coordinates.
data format arguments (optional)
- --format
Possible choices: auto, coord, csv, star, box
file format of the particle coordinates file (default: detect format automatically based on file extension)
Default: “auto”
- --image-ext
sets the image extension if loading images from directory. should include “.” before the extension (e.g. .tiff). (default: find all extensions)
Default: “”
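As a rough illustration of what "detect format automatically based on file extension" can mean, the sketch below maps a coordinates file name to one of the listed format choices. The extension table and the fallback to "coord" are assumptions made for this example; the actual detection rules used by topaz may differ.

```python
import os

# Assumed mapping from file extension to --format choice (illustrative only).
ASSUMED_EXTENSION_FORMATS = {
    ".star": "star",
    ".box": "box",
    ".csv": "csv",
}


def guess_format(path: str, fallback: str = "coord") -> str:
    """Hypothetical helper: pick a format name from the file extension."""
    ext = os.path.splitext(path)[1].lower()
    # Unknown extensions fall back to an assumed default format.
    return ASSUMED_EXTENSION_FORMATS.get(ext, fallback)


if __name__ == "__main__":
    for name in ["particles.star", "mic_001.box", "targets.txt"]:
        print(name, guess_format(name))
```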
cross validation arguments (optional)
- -k, --k-fold
option to split the training set into K folds for cross validation (default: not used)
Default: 0
- --fold
when using K-fold cross validation, sets which fold is used as the held-out test set (default: 0)
Default: 0
- --cross-validation-seed
random seed for partitioning data into folds (default: 42)
Default: 42
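The three cross validation options interact as follows: the seed fixes a shuffle of the training set, K determines how many folds it is cut into, and --fold selects which fold is held out. The sketch below shows one common way to do this, assuming a seeded shuffle followed by round-robin fold assignment. `kfold_split` is a hypothetical helper and is not claimed to reproduce the partition topaz computes.

```python
import random


def kfold_split(items, k, fold, seed=42):
    """Hypothetical helper: split items into (train, held_out) for one fold."""
    if not 0 <= fold < k:
        raise ValueError("fold must satisfy 0 <= fold < k")
    order = list(items)
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(order)
    # Item i of the shuffled list belongs to fold i % k.
    held_out = [x for i, x in enumerate(order) if i % k == fold]
    train = [x for i, x in enumerate(order) if i % k != fold]
    return train, held_out


if __name__ == "__main__":
    micrographs = ["mic_%02d" % i for i in range(10)]  # placeholder names
    train, held_out = kfold_split(micrographs, k=5, fold=0)
    print(len(train), len(held_out))
```

With the same seed, different --fold values select disjoint held-out sets that together cover the whole training set.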
training arguments (required)
- -n, --num-particles
instead of setting pi directly, pi can be set by giving the expected number of particles per micrograph (>0). either this parameter or pi must be set.
Default: -1
- --pi
parameter specifying fraction of data that is expected to be positive
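To make the relationship between --num-particles and --pi concrete, here is a back-of-the-envelope estimate. It assumes that every particle contributes a disk of positive pixels whose size is set by --radius, that particles do not overlap, and that all micrographs share one size. The function names and the image dimensions are hypothetical, and topaz's own conversion may differ in detail.

```python
def disk_pixel_count(radius: int) -> int:
    """Count integer pixel offsets (dx, dy) with dx^2 + dy^2 <= radius^2."""
    return sum(
        1
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if dx * dx + dy * dy <= radius * radius
    )


def estimate_pi(num_particles: float, radius: int, height: int, width: int) -> float:
    """Assumed estimate: positive pixels per micrograph / total pixels."""
    positive = num_particles * disk_pixel_count(radius)
    # Clip at 1.0 because a fraction cannot exceed the whole image.
    return min(1.0, positive / (height * width))


if __name__ == "__main__":
    # Hypothetical example: 400 expected particles on a 4096 x 4096 image.
    print(estimate_pi(400, radius=3, height=4096, width=4096))
```

Under these assumptions, pi grows linearly with the expected particle count and with the area of the positive disk, which is why either value is enough to configure training.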
training arguments (optional)
- -r, --radius
pixel radius around particle centers to consider positive (default: 3)
Default: 3
- --method
Possible choices: PN, GE-KL, GE-binomial, PU
objective function to use for learning the region classifier (default: GE-binomial)
Default: “GE-binomial”
- --slack
weight on GE penalty (default: 10 for GE-KL, 1 for GE-binomial)
Default: -1
- --autoencoder
option to augment method with autoencoder. weight on reconstruction error (default: 0)
Default: 0
- --l2
l2 regularizer on the model parameters (default: 0)
Default: 0.0
- --learning-rate
learning rate for the optimizer (default: 0.0002)
Default: 0.0002
- --natural
sample from the data without rebalancing to form minibatches, rather than sampling particles and non-particles at the ratio given by the --minibatch-balance parameter
Default: False
- --minibatch-size
number of data points per minibatch (default: 256)
Default: 256
- --minibatch-balance
fraction of minibatch that is positive data points (default: 0.0625)
Default: 0.0625
- --epoch-size
number of parameter updates per epoch (default: 1000)
Default: 1000
- --num-epochs
maximum number of training epochs (default: 10)
Default: 10
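The defaults above imply some simple arithmetic, sketched below: how many positive data points a balanced minibatch holds, how many parameter updates a full run performs at most, and how a --slack value of -1 could resolve to the per-method defaults quoted in its help text. The helper names are hypothetical. Rounding the positive count to the nearest integer and treating any negative slack as "not set" are assumptions of this sketch.

```python
def minibatch_composition(minibatch_size=256, minibatch_balance=0.0625):
    """Return (positives, others) for one balanced minibatch."""
    positives = int(round(minibatch_size * minibatch_balance))  # rounding is assumed
    return positives, minibatch_size - positives


def max_parameter_updates(epoch_size=1000, num_epochs=10):
    """Upper bound on updates: updates per epoch times maximum epochs."""
    return epoch_size * num_epochs


def resolve_slack(method, slack=-1):
    """Assumed sentinel handling: a negative slack selects the method default."""
    if slack >= 0:
        return slack
    documented_defaults = {"GE-KL": 10, "GE-binomial": 1}
    # PN and PU have no GE penalty, so no default is listed for them.
    return documented_defaults.get(method)


if __name__ == "__main__":
    print(minibatch_composition())
    print(max_parameter_updates())
    print(resolve_slack("GE-binomial"))
```

With the default settings, each minibatch of 256 holds 16 positive data points, and training performs at most 10000 parameter updates.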
model arguments (optional)
- --pretrained
by default, topaz train will initialize model parameters from the pretrained parameters if a pretrained model with the same configuration is available (e.g. resnet8 with 64 units). disable this behaviour by setting the --no-pretrained flag
Default: True
- --no-pretrained
do not initialize model parameters from a pretrained model
Default: True
- -m, --model
model type to fit (default: resnet8)
Default: “resnet8”
- --units
model parameter: number of units (default: 32)
Default: 32
- --dropout
model parameter: dropout rate (default: 0.0)
Default: 0.0
- --bn
Possible choices: on, off
use batch norm in the model (default: on)
Default: “on”
- --pooling
pooling method to use (default: none)
- --unit-scaling
scale the number of units up by this factor every pool/stride layer (default: 2)
Default: 2
- --ngf
scaled number of units per layer in generative model, only used if autoencoder > 0 (default: 32)
Default: 32
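Both --pretrained and --no-pretrained are listed with Default: True, which is what one would see if the two flags wrote to a single boolean setting. The argparse sketch below shows that pattern. It is an assumption about how such a pair can be wired up, not a copy of the topaz source, and the program name is a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="train-sketch")  # placeholder name
    # Both flags share dest="pretrained", so they toggle one boolean.
    parser.add_argument("--pretrained", dest="pretrained",
                        action="store_true", default=True)
    parser.add_argument("--no-pretrained", dest="pretrained",
                        action="store_false")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args([]).pretrained)
    print(parser.parse_args(["--no-pretrained"]).pretrained)
```

In this arrangement, passing neither flag leaves the setting enabled, and only --no-pretrained changes it.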
output file arguments (optional)
- --save-prefix
path prefix to save trained models each epoch
- -o, --output
destination to write the train/test curve
miscellaneous arguments (optional)
- --test-batch-size
batch size for calculating test set statistics (default: 1)
Default: 1
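Putting the required and most common optional arguments together, a training run can be launched from a script by assembling the argument list first. The sketch below only uses flags listed in this reference. All file paths and the particle count are hypothetical placeholders, and `build_train_command` is an illustrative helper rather than part of topaz.

```python
import shlex


def build_train_command(train_images, train_targets, num_particles,
                        save_prefix, output, extra=()):
    """Hypothetical helper: assemble an argv list for `topaz train`."""
    argv = [
        "topaz", "train",
        "--train-images", train_images,
        "--train-targets", train_targets,
        "--num-particles", str(num_particles),
        "--save-prefix", save_prefix,
        "--output", output,
    ]
    argv.extend(extra)  # any further flags from this reference
    return argv


if __name__ == "__main__":
    argv = build_train_command(
        "data/train_images/",          # placeholder directory of micrographs
        "data/train_particles.csv",    # placeholder coordinates file
        400,                           # placeholder particles per micrograph
        "saved_models/model",
        "saved_models/training_curve.txt",
        extra=["--num-epochs", "10", "--device", "0"],
    )
    # shlex.join (Python 3.8+) quotes the arguments for display in a shell.
    print(shlex.join(argv))
    # To actually run it, pass argv to subprocess.run(argv, check=True).
```

Keeping the command as a list avoids shell quoting problems when paths contain spaces.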