Train

usage: train [-h] [--describe] [-d DEVICE] [--num-workers NUM_WORKERS]
             [-j NUM_THREADS] [--train-images TRAIN_IMAGES]
             [--train-targets TRAIN_TARGETS] [--test-images TEST_IMAGES]
             [--test-targets TEST_TARGETS]
             [--format {auto,coord,csv,star,box}] [--image-ext IMAGE_EXT]
             [-k K_FOLD] [--fold FOLD]
             [--cross-validation-seed CROSS_VALIDATION_SEED]
             [-n NUM_PARTICLES] [--pi PI] [-r RADIUS]
             [--method {PN,GE-KL,GE-binomial,PU}] [--slack SLACK]
             [--autoencoder AUTOENCODER] [--l2 L2]
             [--learning-rate LEARNING_RATE] [--natural]
             [--minibatch-size MINIBATCH_SIZE]
             [--minibatch-balance MINIBATCH_BALANCE] [--epoch-size EPOCH_SIZE]
             [--num-epochs NUM_EPOCHS] [--pretrained] [--no-pretrained]
             [-m MODEL] [--units UNITS] [--dropout DROPOUT] [--bn {on,off}]
             [--pooling POOLING] [--unit-scaling UNIT_SCALING] [--ngf NGF]
             [--save-prefix SAVE_PREFIX] [-o OUTPUT]
             [--test-batch-size TEST_BATCH_SIZE]

Named Arguments

--describe

only prints a description of the model, does not train

Default: False

-d, --device

which device to use, set to -1 to force CPU (default: 0)

Default: 0

--num-workers

number of worker processes for data augmentation. if set to <0, automatically uses all available CPUs (default: 0)

Default: 0

-j, --num-threads

number of threads for pytorch, 0 uses pytorch defaults, <0 uses all cores (default: 0)

Default: 0

training data arguments (required)

--train-images: path to a file listing the training images. also accepts a directory path, from which all images are loaded.
--train-targets: path to a file listing the training particle coordinates.

test data arguments (optional)

--test-images: path to a file listing the test images. also accepts a directory path, from which all images are loaded.
--test-targets: path to a file listing the test particle coordinates.

data format arguments (optional)

--format

Possible choices: auto, coord, csv, star, box

file format of the particle coordinates file (default: detect format automatically based on file extension)

Default: “auto”

--image-ext

sets the image extension when loading images from a directory. should include the “.” before the extension (e.g. .tiff). (default: find all extensions)

Default: “”

cross validation arguments (optional)

-k, --k-fold

option to split the training set into K folds for cross validation (default: not used)

Default: 0

--fold

when using K-fold cross validation, sets which fold is used as the held-out test set (default: 0)

Default: 0

--cross-validation-seed

random seed for partitioning data into folds (default: 42)

Default: 42
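
The three cross validation options combine as follows: the seed fixes a shuffle, the shuffled data is divided into K folds, and one fold is held out. The sketch below only illustrates that idea with the standard library. The function name `kfold_split` and the image names are hypothetical, and the actual partitioning used by the tool may differ (for example, in how it shuffles or assigns items to folds).

```python
import random


def kfold_split(items, k, fold, seed=42):
    """Shuffle items with a fixed seed, deal them into k folds, hold out one fold."""
    if k < 2:
        raise ValueError("k must be at least 2 for cross validation")
    if not 0 <= fold < k:
        raise ValueError("fold must satisfy 0 <= fold < k")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same partition
    folds = [shuffled[i::k] for i in range(k)]  # deal items round-robin
    held_out = folds[fold]
    train = [x for i, f in enumerate(folds) if i != fold for x in f]
    return train, held_out


images = [f"micrograph_{i:02d}" for i in range(10)]  # hypothetical image names
train, test = kfold_split(images, k=5, fold=0, seed=42)
print(len(train), len(test))  # 8 2
```

Keeping the seed and K fixed while varying the fold index from 0 to K-1 holds out each fold exactly once.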

training arguments (required)

-n, --num-particles

instead of setting pi directly, pi can be set by giving the expected number of particles per micrograph (>0). either this parameter or pi must be set.

Default: -1

--pi

parameter specifying fraction of data that is expected to be positive
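
To build intuition for how an expected particle count relates to pi, the sketch below estimates the positive fraction for a single micrograph size. It rests on assumptions that are not stated in this reference: each particle marks a filled disc of pixels within the radius of its center as positive, discs do not overlap, and every pixel is one data point. The function name `estimate_pi` and the image dimensions are hypothetical, and the conversion actually performed by the tool may differ.

```python
def estimate_pi(num_particles, radius, image_height, image_width):
    """Rough positive fraction, assuming non-overlapping discs of positive pixels."""
    # Count integer pixel offsets that fall inside a disc of the given radius.
    positives_per_particle = sum(
        1
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if dx * dx + dy * dy <= radius * radius
    )
    total_pixels = image_height * image_width
    return num_particles * positives_per_particle / total_pixels


# Hypothetical example: 300 expected particles on a 1024 x 1024 micrograph,
# with the default radius of 3.
pi_estimate = estimate_pi(300, radius=3, image_height=1024, image_width=1024)
print(f"{pi_estimate:.4f}")
```

Under these assumptions pi is usually a small number, since particle centers cover only a small part of each micrograph.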

training arguments (optional)

-r, --radius

pixel radius around particle centers to consider positive (default: 3)

Default: 3

--method

Possible choices: PN, GE-KL, GE-binomial, PU

objective function to use for learning the region classifier (default: GE-binomial)

Default: “GE-binomial”

--slack

weight on GE penalty (default: 10 for GE-KL, 1 for GE-binomial)

Default: -1
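
The listed default of -1 reads as a sentinel meaning "use the method-specific default" from the description above. A minimal sketch of that resolution, assuming any negative value triggers the fallback and that methods without a GE penalty (PN, PU) have no slack weight, could look like this. The function name `resolve_slack` is hypothetical.

```python
def resolve_slack(method, slack=-1):
    """Return the GE penalty weight, falling back to the per-method default."""
    method_defaults = {"GE-KL": 10, "GE-binomial": 1}  # from the description above
    if method not in method_defaults:
        return None  # assumption: PN and PU have no GE penalty to weight
    if slack < 0:  # assumption: a negative value means "not set"
        return method_defaults[method]
    return slack


print(resolve_slack("GE-binomial"))  # 1
print(resolve_slack("GE-KL"))        # 10
print(resolve_slack("GE-KL", 2.5))   # 2.5
```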

--autoencoder

option to augment the method with an autoencoder. the value sets the weight on the reconstruction error (default: 0)

Default: 0

--l2

l2 regularizer on the model parameters (default: 0)

Default: 0.0

--learning-rate

learning rate for the optimizer (default: 0.0002)

Default: 0.0002

--natural

sample without bias from the data to form minibatches, rather than sampling particles and non-particles at the ratio given by the minibatch-balance parameter

Default: False

--minibatch-size

number of data points per minibatch (default: 256)

Default: 256

--minibatch-balance

fraction of minibatch that is positive data points (default: 0.0625)

Default: 0.0625

--epoch-size

number of parameter updates per epoch (default: 1000)

Default: 1000

--num-epochs

maximum number of training epochs (default: 10)

Default: 10
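
As a quick check of what the default minibatch and epoch settings imply, the arithmetic below works out the number of positive data points per minibatch and the upper bound on parameter updates. The variable names are illustrative only, and how the tool rounds a non-integer positive count is not specified here (the defaults divide evenly).

```python
minibatch_size = 256        # --minibatch-size
minibatch_balance = 0.0625  # --minibatch-balance
epoch_size = 1000           # --epoch-size
num_epochs = 10             # --num-epochs (a maximum)

positives_per_batch = int(minibatch_size * minibatch_balance)
others_per_batch = minibatch_size - positives_per_batch
max_updates = epoch_size * num_epochs
max_data_points = max_updates * minibatch_size

print(positives_per_batch, others_per_batch, max_updates, max_data_points)
# 16 240 10000 2560000
```

With --natural set, the balance parameter does not apply, so the per-batch split above no longer holds.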

model arguments (optional)

--pretrained

by default, topaz train will initialize model parameters from the pretrained parameters if a pretrained model with the same configuration is available (e.g. resnet8 with 64 units). disable this behaviour by setting the --no-pretrained flag

Default: True

--no-pretrained

do not initialize model parameters from the pretrained parameters

Default: True
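
Both flags list a default of True, which is what a pair of argparse options writing to one shared destination would show. The sketch below demonstrates that common argparse pattern. It is an assumption about how the pair is wired, not a copy of the tool's parser.

```python
import argparse

parser = argparse.ArgumentParser(prog="train")
# Both options write to the same destination, `pretrained`.
parser.add_argument("--pretrained", dest="pretrained", action="store_true")
parser.add_argument("--no-pretrained", dest="pretrained", action="store_false")
parser.set_defaults(pretrained=True)  # parser-level default applies to both

print(parser.parse_args([]).pretrained)                   # True
print(parser.parse_args(["--no-pretrained"]).pretrained)  # False
```

In this arrangement --pretrained is effectively a no-op unless it follows --no-pretrained on the same command line, because the last flag given wins.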

-m, --model

model type to fit (default: resnet8)

Default: “resnet8”

--units

model parameter: number of units (default: 32)

Default: 32

--dropout

model parameter: dropout rate (default: 0.0)

Default: 0.0

--bn

Possible choices: on, off

use batch norm in the model (default: on)

Default: “on”

--pooling

pooling method to use (default: none)

--unit-scaling

scale the number of units up by this factor every pool/stride layer (default: 2)

Default: 2

--ngf

scaled number of units per layer in generative model, only used if autoencoder > 0 (default: 32)

Default: 32

output file arguments (optional)

--save-prefix: path prefix for saving the trained model after each epoch
-o, --output: destination to write the train/test curve

miscellaneous arguments (optional)

--test-batch-size

batch size for calculating test set statistics (default: 1)

Default: 1
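
Putting the required arguments together, the sketch below assembles a `topaz train` command line from flags documented on this page and prints it in shell-quoted form. The helper `build_train_command` and all file paths are hypothetical placeholders, and the value of 300 expected particles is only an example.

```python
import shlex


def build_train_command(train_images, train_targets, num_particles,
                        save_prefix=None, output=None, extra=()):
    """Assemble an argument list for `topaz train` from documented flags."""
    cmd = [
        "topaz", "train",
        "--train-images", train_images,
        "--train-targets", train_targets,
        "-n", str(num_particles),  # either -n or --pi must be set
    ]
    if save_prefix is not None:
        cmd += ["--save-prefix", save_prefix]
    if output is not None:
        cmd += ["-o", output]
    cmd += list(extra)  # any further flags from this page
    return cmd


cmd = build_train_command(
    "data/train_images/",            # hypothetical directory of micrographs
    "data/train_particles.csv",      # hypothetical coordinates file
    num_particles=300,
    save_prefix="saved_models/model",
    output="saved_models/training_curve.txt",
    extra=["--num-epochs", "10", "-d", "-1"],  # -d -1 forces CPU
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where topaz is installed.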