Contributing
============

You're welcome to contribute.

1. Fork the repository on GitHub.
2. Clone the forked repository into a local directory:
   ``git clone my-repository-url``
3. Create a new branch: ``git checkout -b my-new-feature``
4. Commit your changes: ``git commit -a``
5. Push to the branch: ``git push origin my-new-feature``
6. Submit a pull request on GitHub.

Source code packages
--------------------

``theanolm.commands`` package contains the main scripts for launching the
subcommands.

``theanolm.network`` package contains ``Network`` class, which constructs the
network from layer objects and stores the neural network state (parameters).
Each layer type is implemented in its own class that derives from
``BasicLayer``. These classes specify the layer parameters and the mathematical
structure using symbolic variables.

``theanolm.parsing`` package contains classes for iterating text and converting
it to mini-batches.

``theanolm.training`` package contains ``Trainer`` class, which performs the
training iterations. It is responsible for cross-validation and learning rate
adjustment. It uses one of the optimization classes derived from
``BasicOptimizer`` to compute the gradients and adjust the network parameters.

``theanolm.scoring`` package contains the ``TextScorer`` class for scoring
sentences and ``LatticeDecoder`` class for decoding word lattices.
``TextScorer`` is used both for cross-validation during training and by the
score command for evaluating text.

``theanolm.textsampler.TextSampler`` class is used by the sample command for
generating text.

Neural network structure
------------------------

A ``Network`` object contains tensors ``input_word_ids``, ``input_class_ids``,
and ``mask`` that represent the mini-batch input of the network, i.e. a set of
n word sequences, where n is the batch size. These symbolic variables represent
two-dimensional matrices. The first dimension is the time step, i.e. the index
of a word inside a sequence, and the second dimension is the sequence. The mask
indicates which elements are past the sequence end; the output will be ignored
if the corresponding mask value is zero. Theano functions that utilize the
network have these tensors as inputs. Their values will be read from a text file
by a ``BatchIterator``.

Layers receive a list of input layers in the constructor. The constructor
creates the initial values of the layer parameters. Every layer implements the
``create_structure()`` method that describe its output, given its parameters and
the output of its input layers.

The ``Network`` constructs the layer objects. First layer object is a
``NetworkInput``, which is not a real layer, but just provides in its output
either the word ID or class ID matrix. The first layer following a
``NetworkInput`` should be a ``ProjectionLayer``. It maps the integer word IDs
into floating point vectors. Thus the projection layer and all the subsequent
layers output a three-dimensional tensor, where the third dimension is the
activation vector.

.. image:: images/batch-processing.png