Individual word frequencies ($f_i$) are counted together with joint frequencies ($f_{ij}$) for all possible pairs of words, and the corresponding standardised joint frequencies are calculated:
$$s_{ij} = \frac{f_{ij}}{f_i + f_j - f_{ij}},$$
where $f_{ij}$ and $f_i$, $f_j$ refer respectively to the joint and individual frequencies of words $i$ and $j$ in a given vocabulary list, expressed in units of context in each case.
This simple (Jaccard) coefficient treats joint non-occurrences as irrelevant, which seems to be a suitable procedure in textual analysis. It is, however, indifferent also to the order in which the words in each pair occur, and depends for its values on a sensible choice of context unit being made in reading the text.
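The counting and standardisation described above can be sketched in a few lines of Python. This is an illustration only, not HAMLET's own code: the function names, the representation of each context unit as a set of words, and the toy data are all assumptions made for the example.

```python
from itertools import combinations


def joint_frequencies(units, vocabulary):
    """Count individual (f_i) and joint (f_ij) frequencies in units of context.

    `units` is a list of sets of words, one set per context unit (an assumed
    representation). A word is counted at most once per unit.
    """
    f = {w: 0 for w in vocabulary}
    fij = {pair: 0 for pair in combinations(vocabulary, 2)}
    for unit in units:
        # Keep vocabulary order so pairs match the keys created above.
        present = [w for w in vocabulary if w in unit]
        for w in present:
            f[w] += 1
        for pair in combinations(present, 2):
            fij[pair] += 1
    return f, fij


def jaccard(f_i, f_j, f_ij):
    """Standardised joint frequency: f_ij / (f_i + f_j - f_ij)."""
    denom = f_i + f_j - f_ij
    return f_ij / denom if denom else 0.0


# Toy data: four context units and a three-word vocabulary list.
units = [
    {"king", "queen", "crown"},
    {"king", "ghost"},
    {"queen", "crown"},
    {"ghost"},
]
vocabulary = ["king", "queen", "ghost"]

f, fij = joint_frequencies(units, vocabulary)
s = {pair: jaccard(f[pair[0]], f[pair[1]], n) for pair, n in fij.items()}
print(s)
```

Because the units are sets, the order of words within a unit plays no part in the result, and units containing neither word of a pair do not enter the calculation, which matches the two properties of the coefficient noted above.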
If words $i$ and $j$ occur independently of one another, the above coefficient has an expected value of
$$E(s_{ij}) = \frac{f_i f_j}{t(f_i + f_j) - f_i f_j},$$
since the expected value of $f_{ij}$ is $f_i f_j / t$, where $t$ is the total number of context-units counted in the text.
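As a quick check on this expression, the sketch below computes the expected joint frequency and the expected coefficient, and confirms that substituting the expected joint frequency into the coefficient gives the same value. The function names and the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def expected_joint_frequency(f_i, f_j, t):
    """Expected f_ij for two words with frequencies f_i, f_j in t context units."""
    return f_i * f_j / t


def expected_jaccard(f_i, f_j, t):
    """Expected coefficient: f_i * f_j / (t * (f_i + f_j) - f_i * f_j)."""
    denom = t * (f_i + f_j) - f_i * f_j
    return (f_i * f_j) / denom if denom else 0.0


# Hypothetical counts: two words found in 10 and 20 of 100 context units.
f_i, f_j, t = 10, 20, 100
e_fij = expected_joint_frequency(f_i, f_j, t)   # 10 * 20 / 100 = 2.0
e_sij = expected_jaccard(f_i, f_j, t)           # 200 / 2800 = 1/14

# Substituting the expected f_ij into the coefficient gives the same value.
substituted = e_fij / (f_i + f_j - e_fij)
print(e_fij, e_sij, substituted)
```

An observed coefficient well above this value suggests that the two words occur together more often than their individual frequencies alone would lead one to expect.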
As an alternative, you can employ Sokal's matching coefficient, in which the number of joint non-occurrences is also included in the numerator and denominator. In the terms already outlined, this coefficient is:
$$c_{ij} = \frac{f_{ij} + t - (f_i + f_j - f_{ij})}{t}$$
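The following minimal sketch evaluates the matching coefficient from the same four quantities. The function name and the example counts are assumptions made for illustration.

```python
def matching_coefficient(f_i, f_j, f_ij, t):
    """Matching coefficient: (joint occurrences + joint non-occurrences) / t."""
    if t <= 0:
        raise ValueError("t must be a positive number of context units")
    # Units containing neither word: t minus the units containing at least one.
    joint_absent = t - (f_i + f_j - f_ij)
    return (f_ij + joint_absent) / t


# Hypothetical counts: each word occurs in 2 of 4 units, together in 1.
c_together = matching_coefficient(2, 2, 1, 4)   # (1 + 1) / 4
# Same individual frequencies, but the words never share a unit.
c_apart = matching_coefficient(2, 2, 0, 4)      # (0 + 0) / 4
print(c_together, c_apart)
```

Unlike the Jaccard coefficient, this measure rises when many context units contain neither word, so two rare words can score highly in a long text even if they never occur together.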
Several possibilities are offered for the definition of context-units when reading a text. These must be used with some care, to ensure that the context-units chosen are indeed capable of meaningful interpretation, and that they are not so large that almost all of the target words occur together in each unit, losing any discrimination in the analysis.
If the text file contains special characters that should be ignored when making comparisons in HAMLET, enter these in the edit box provided in the options window. Characters used in this way must be chosen to serve this purpose alone, since they must not be confused with normal text and punctuation.
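The need to reserve such characters for this purpose alone can be seen in a small sketch. It assumes that "ignoring" a character amounts to deleting it before comparison, which may not be how HAMLET itself handles it, and the function name is hypothetical.

```python
def strip_ignored(text, ignored_chars):
    """Delete every character listed in `ignored_chars` from `text`."""
    return text.translate(str.maketrans("", "", ignored_chars))


# A markup character that serves no other purpose is removed cleanly.
clean = strip_ignored("the ^king^ spoke", "^")

# A character that is also normal punctuation is lost everywhere it occurs.
damaged = strip_ignored("He spoke. She left.", ".")
print(clean)
print(damaged)
```

In the second case the sentence boundaries disappear along with the ignored character, which would matter if sentences were being used as context units.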
Check the box in the options window if searching for words is to be case-sensitive. If this option is chosen, words in the vocabulary list must also be entered with regard to upper- and lower-case letters if they are not to be missed. Take care: inadvertent choice of case-sensitive searching when the search list has been specified without regard to case can lead to unexpected results.
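The kind of unexpected result meant here can be illustrated as follows. This is a sketch of the general effect, not of HAMLET's matching routine; the function name and the tokenised units are assumptions.

```python
def units_containing(word, units, case_sensitive=False):
    """Count the context units (lists of tokens) in which `word` occurs."""
    if not case_sensitive:
        word = word.lower()
        units = [[token.lower() for token in unit] for unit in units]
    return sum(1 for unit in units if word in unit)


units = [
    ["The", "King", "spoke"],
    ["the", "king", "slept"],
]

# Vocabulary entry typed in lower case, searched without regard to case.
insensitive = units_containing("king", units)
# The same entry with case-sensitive searching misses the capitalised form.
sensitive = units_containing("king", units, case_sensitive=True)
print(insensitive, sensitive)
```

With case-sensitive searching, both "king" and "King" would have to appear in the vocabulary list for every occurrence to be counted.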
Raw and standardised joint frequencies are displayed in lower-triangular matrix format, suitably labelled with the corresponding vocabulary list entries. Either matrix can be regarded as a set of similarity measures between pairs of words, and can be submitted to further analysis using Cluster Analysis, Multidimensional Scaling methods or Correspondence Analysis to identify characteristic word clusters or associations of symbols in the original text.
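A labelled lower-triangular layout of this kind might be produced as in the sketch below. The exact layout used by HAMLET is not reproduced here; the function name, column widths and example values are assumptions.

```python
def lower_triangle(labels, value, decimals=3):
    """Return the rows of a labelled lower-triangular matrix as strings.

    `value(i, j)` must return the entry for row i and column j, with j < i.
    Column labels are placed on a final row beneath the matrix.
    """
    width = max(len(w) for w in labels) + decimals + 3
    rows = []
    for i in range(1, len(labels)):
        cells = "".join(f"{value(i, j):>{width}.{decimals}f}" for j in range(i))
        rows.append(f"{labels[i]:<{width}}{cells}")
    rows.append(" " * width + "".join(f"{w:>{width}}" for w in labels[:-1]))
    return rows


# Hypothetical standardised joint frequencies for a three-word list.
labels = ["king", "queen", "ghost"]
s = {
    ("king", "queen"): 1 / 3,
    ("king", "ghost"): 1 / 3,
    ("queen", "ghost"): 0.0,
}

rows = lower_triangle(labels, lambda i, j: s[(labels[j], labels[i])])
print("\n".join(rows))
```

Only the entries below the diagonal are needed, because both coefficients are symmetric in the two words of a pair.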
Using HAMLET Joint Frequencies

The most important points to check are:
You will then be asked if you want to carry out further analyses of the matrix of joint occurrences using Cluster Analysis, Multidimensional Scaling or Correspondence Analysis, and prompted to save any of the files which have been generated for later use, to avoid having to repeat the current joint frequencies analysis.