The main idea of HAMLET II 3.0(c) is to search text files for words or categories in a given vocabulary list, and to count their joint frequencies within any specified context unit, within sentences, or as collocations within a given span of words.
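The idea of counting collocations within a span can be illustrated with a minimal Python sketch. It is not HAMLET's own code: the function name `collocation_counts`, the whitespace tokenisation, and the reading of "span" as a maximum distance in tokens between two vocabulary words are all assumptions made for the illustration.

```python
from collections import Counter


def collocation_counts(tokens, vocabulary, span):
    """Count unordered pairs of distinct vocabulary words that occur
    at most `span` token positions apart (hypothetical helper)."""
    vocab = set(vocabulary)
    # Keep only the positions where a vocabulary word occurs.
    hits = [(pos, tok) for pos, tok in enumerate(tokens) if tok in vocab]
    counts = Counter()
    for a in range(len(hits)):
        pos_a, word_a = hits[a]
        for b in range(a + 1, len(hits)):
            pos_b, word_b = hits[b]
            if pos_b - pos_a > span:
                break  # later hits are even further away
            if word_a != word_b:
                counts[tuple(sorted((word_a, word_b)))] += 1
    return counts


text = "the king saw the ghost and the queen saw the king"
pairs = collocation_counts(text.split(), ["king", "queen", "ghost"], span=4)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

Widening or narrowing `span` changes which pairs count as collocations; counting within sentences or other context units would replace the distance test with a test of membership in the same unit.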
This procedure is applicable when there are good grounds for searching for inter-connections between a number of key words. The latest version includes a procedure to assist in identifying these in relation to potential latent topics, according to the generative model provided by Latent Dirichlet Allocation.
HAMLET combines the measurement of empirical properties of texts with graphical visualization. Qualitative and quantitative analysis are integral parts of HAMLET's design. Unlike much other text analysis software, HAMLET II 3.0 makes the processes involved transparent within a single, user-friendly interface, leaving the user in complete control.
Individual word frequencies ($f_i$) and joint frequencies ($f_{ij}$) for pairs of words $(i, j)$, both expressed in terms of the chosen unit of context, and the corresponding standardised joint frequencies $s_{ij} = f_{ij} / (f_i + f_j - f_{ij})$ are organised in a similarities matrix, which can be submitted to a combination of cluster analysis and multi-dimensional scaling to discover significant word-associations.
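The standardised joint frequency above can be made concrete with a short Python sketch. It assumes that each context unit is given as a list of tokens and that a word's frequency is the number of units in which it appears; the function name `jaccard_similarities` and the sample data are invented for the illustration and are not part of HAMLET.

```python
from itertools import combinations


def jaccard_similarities(units, vocabulary):
    """Return (f, f_joint, s): unit frequencies, joint frequencies and
    s_ij = f_ij / (f_i + f_j - f_ij) for every pair of vocabulary words."""
    vocab = list(vocabulary)
    f = {w: 0 for w in vocab}
    f_joint = {pair: 0 for pair in combinations(vocab, 2)}
    for unit in units:
        tokens = set(unit)
        # Words of the vocabulary present in this unit, in vocabulary order.
        present = [w for w in vocab if w in tokens]
        for w in present:
            f[w] += 1
        for pair in combinations(present, 2):
            f_joint[pair] += 1
    s = {}
    for (i, j), fij in f_joint.items():
        denom = f[i] + f[j] - fij
        s[(i, j)] = fij / denom if denom else 0.0
    return f, f_joint, s


units = [
    ["king", "queen", "ghost"],
    ["king", "ghost"],
    ["queen", "crown"],
    ["king", "queen", "crown"],
]
f, f_joint, s = jaccard_similarities(units, ["king", "queen", "ghost"])
print(s[("king", "queen")])  # 2 / (3 + 3 - 2) = 0.5
```

The resulting dictionary `s` holds the upper triangle of the similarities matrix; a clustering or scaling routine would take it from there.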
In addition to the above (Jaccard) coefficient, it is possible to apply Sokal's 'matching coefficient', which also takes account of joint non-occurrences, and the measure of association strength of van Eck and Waltman (2009), otherwise known as the proximity or probabilistic affinity index. Word co-occurrences within specified context units can also be submitted to correspondence analysis, providing further information about usage within a text.
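The two alternative coefficients can be sketched in the same terms. The forms used here are the commonly cited ones and are assumptions about what HAMLET computes: the simple matching coefficient as the share of context units in which two words are either both present or both absent, and the association strength as $f_{ij} / (f_i f_j)$, which some authors additionally scale by the number of units. The function names are hypothetical.

```python
def simple_matching(fi, fj, fij, n_units):
    """Share of context units where words i and j agree:
    both present (fij) or both absent."""
    both_absent = n_units - fi - fj + fij
    return (fij + both_absent) / n_units


def association_strength(fi, fj, fij):
    """Joint frequency divided by the product of the individual
    frequencies (unscaled form; an assumption for this sketch)."""
    return fij / (fi * fj) if fi and fj else 0.0


# Word i occurs in 3 of 4 units, word j in 2, and both together in 2.
print(simple_matching(3, 2, 2, n_units=4))   # (2 + 1) / 4 = 0.75
print(association_strength(3, 2, 2))
```

Unlike the Jaccard coefficient, the matching coefficient rises when many units contain neither word, so it tends to rate rare word pairs as similar.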
It then becomes possible to compare the results of applying multi-dimensional scaling to matrices of joint frequencies of equivalent vocabulary lists derived from a number of texts, using Procrustean Individual Differences Scaling (PINDIS), or to apply Individual Differences Scaling (INDSCAL) to the matrices themselves. Forrest Young's SUBJSTAT procedure, which transforms the resulting non-Euclidean 'subject spaces' into arc-distances, permits more rigorous analysis of their results. Alternatively, the profiles of occurrences of items of a given search list in a number of different texts can be compared directly by singular value decomposition or correspondence analysis.
Further procedures help to determine the broad characteristics of word usage in a text.
The unique graphics of HAMLET II(c) summarise the results of each of these analyses, for inclusion in other documents and reports. Numerical results can be saved, if necessary, in CSV format for further statistical analysis in Stata, Microsoft Excel or R.
HAMLET II 3.0 for Windows(c) is suitable for use with Microsoft Windows XP, Vista, Windows 7, 8, 8.1 and 10. Full documentation is available here.
To run HAMLET II 3.0 for Windows under Wine on Debian GNU/Linux, consult our documentation on HAMLET II on Debian GNU/Linux.
|Download HAMLET II 3.0||Download documentation||HAMLET II 3.0 tutorial (HTML)|
Originators and sole distributors:
Please address all enquiries and report any problems to
Alan Brier, Associate Member, ESRC-National Centre for Research Methods, Southampton, UK
Bruno Hopp, GESIS - Leibniz-Institut für Sozialwissenschaften, Cologne, Germany