
Hermann Moisl - Cluster Analysis for Corpus Linguistics

Year: 2015. Publisher: Mouton De Gruyter.


The rapidly growing volume of digital natural language text and the complexity of data abstracted from it have increasingly rendered traditional corpus linguistic analytical methodology obsolete. This book describes a cluster analytic methodology for generating linguistic hypotheses on the basis of data abstracted from language corpora.

8 Appendix

This Appendix lists software implementations of the clustering methods presented earlier. The coverage is not exhaustive: only software known to the author to be useful either via direct experience or online reviews is included.

8.1 Cluster analysis facilities in general-purpose statistical packages

Most general-purpose statistics / data analysis packages provide some subset of the standard dimensionality reduction and cluster analysis methods: principal component analysis, factor analysis, multidimensional scaling, k-means clustering, hierarchical clustering, and sometimes others not covered in this book. In addition, they typically provide an extensive range of extremely useful data creation and transformation facilities. A selection of them is listed in alphabetical order below; URLs are given for each and are valid at the time of writing.

8.1.1 Commercial
  • GENSTAT
    http://www.vsni.co.uk/software/genstat
  • MINITAB
    http://www.minitab.com/en-US/products/minitab/
  • NCSS
    http://www.ncss.com/
  • SAS
    http://www.sas.com/
  • SPSS
    http://www-01.ibm.com/software/uk/analytics/spss/
  • STATA
    http://www.stata.com/
  • STATGRAPHICS
    http://www.statgraphics.com/
  • STATISTICA
    http://www.statsoft.com/
  • SYSTAT
    http://www.systat.com/
8.1.2 Freeware
  • CHAMELEON STATISTICS
    http://www.seventh-sense-software.com/chameleon.htm
  • MICROSIRIS
    http://www.microsiris.com/
  • ORIGINLAB
    http://www.originlab.com/
  • PAST
    http://folk.uio.no/ohammer/past/
  • PSPP
    http://www.gnu.org/software/pspp/
  • TANAGRA
    http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html
    • This is unusual in including the self-organizing map in addition to the standard methods.
  • WINIDAMS
    www.unesco.org/idams/
  • WINSTAT
    http://www.winstat.com/
8.2 Cluster analysis-specific software

The following software is designed specifically for cluster analysis.

8.2.1 Commercial
  • ANTHROPAC
    http://www.analytictech.com/anthropac/apacdesc.htm
    • Principal component analysis, factor analysis, hierarchical, multi-dimensional scaling
  • BMDP
    http://www.statistical-solutions-software.com/bmdp-statistical-software/cluster-analysis/
    • Hierarchical, k-means
  • CLUSTAN
    http://www.clustan.com/
    • Hierarchical, k-means
  • GELCOMPAR
    http://www.applied-maths.com/gelcompar-ii
    • Principal component analysis, hierarchical, multidimensional scaling
  • KCS
    http://www.kovcomp.co.uk/mvsp/
    • Principal component analysis, hierarchical
  • STATISTIXL
    http://statistixl.software.informer.com/
    • Principal component analysis, factor analysis, hierarchical
  • VISCOVERY
    http://www.viscovery.net/
    • Self-organizing map
  • VISIPOINT
    http://www.visipoint.fi/
    • Self-organizing map, Sammon's mapping
8.2.2 Freeware
  • CLUSTER 3.0
    http://bonsai.hgc.jp/~mdehoon/software/cluster/
    • Hierarchical, k-means, self-organizing map, principal component analysis
  • DATABIONIC ESOM
    http://databionic-esom.sourceforge.net/
    • Emergent self-organizing map, an extension of the SOM described earlier
  • GENESIS
    http://genome.tugraz.at/genesisclient/genesisclient-description.shtml
    • Principal component analysis, hierarchical, k-means, self-organizing map
  • MICROARRAYS CLUSTER
    http://derisilab.ucsf.edu/microarray/software.html or http://rana.lbl.gov/EisenSoftware.htm
    • Principal component analysis, hierarchical, k-means, self-organizing map
  • MULTIBASE
    http://www.numericaldynamics.com/
    • Principal component analysis, hierarchical
  • OC
    http://www.compbio.dundee.ac.uk/Software/OC/oc.html
    • Hierarchical
  • PERMUTMATRIX
    http://www.lirmm.fr/~caraux/PermutMatrix/
    • Hierarchical
  • SERF CLUSTERS
    http://www.bram.org/serf/Clusters.php
    • Hierarchical
8.3 Programming languages

All the foregoing packages are good, most are excellent, and any corpus linguist who is seriously interested in applying cluster analysis to his or her research can use them with confidence. That corpus linguist should, however, consider learning how to use at least one programming language for this purpose. The packages listed above offer a small subset of the dimensionality reduction and cluster analysis methods currently available in the research literature, and users of them are restricted to this subset; developments of and alternatives to these methods, such as DBSCAN and the many others that were not even mentioned, remain inapplicable. These developments and alternatives have appeared and continue to appear for a reason: to refine cluster analytic methodology. In principle, researchers should be in a position to use the best methodology available in their field, and programming makes the current state of clustering methodology accessible to corpus linguists because it renders implementation of any current or future clustering method feasible. A similar case for programming is made by Gries (2011a).
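To give a sense of what implementing a method oneself involves, the following is a minimal sketch of DBSCAN, the density-based method named above. It is written in Python purely for illustration and is not taken from the book; it uses only the standard library, assumes Euclidean distance, and makes no attempt at efficiency. The function names (region_query, dbscan), the parameter names (eps, min_pts) and the toy two-dimensional data are all assumptions chosen for the example. In a corpus-linguistic application the points would be the row vectors of a data matrix.

```python
import math


def region_query(points, i, eps):
    """Return the indices of all points within distance eps of points[i].

    The point itself is included, as in the usual DBSCAN definition.
    """
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1

    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            # Not dense enough to start a cluster; may later become a border point.
            labels[i] = NOISE
            continue

        # points[i] is a core point: start a new cluster and grow it outwards.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                # Previously marked as noise, now reachable: a border point.
                labels[j] = cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                # points[j] is itself a core point, so the cluster extends through it.
                queue.extend(j_neighbours)
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these settings the two dense groups receive the labels 0 and 1 and the isolated point is labelled -1 (noise). Unlike k-means, the number of clusters is not specified in advance; it follows from the choice of eps and min_pts.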

There are numerous programming languages, and in principle any of them can be used for corpus linguistic applications. In practice, however, two have emerged as the languages of choice for quantitative natural language processing generally: Matlab and R. Both are high-level programming languages in the sense that they provide many of the functions relevant to statistical and mathematical computation as language-native primitives and offer a wide range of excellent graphics facilities for display of results. For any given algorithm, this allows programs to be shorter and less complex than they would be for lower-level, less domain-specific languages like, say, Java or C++, and makes the languages themselves easier to learn.

Matlab (http://www.mathworks.co.uk/) is described by its website as "a high-level language and interactive environment for numerical computation, visualization, and programming". It provides numerous and extensive libraries of functions specific to different types of quantitative computation such as signal and image processing, control system design and analysis, and computational finance. One of these libraries is called Math, Statistics, and Optimization, and it contains a larger range of dimensionality reduction and cluster analysis functions than any of the above software packages: principal component analysis, canonical correlation, factor analysis, singular value decomposition, multidimensional scaling, Sammon's mapping, hierarchical clustering, k-means, self-organizing map, and Gaussian mixture models. This is a useful gain in coverage, but the real advantage of Matlab over the packages is twofold. On the one hand, Matlab makes it possible for users to contribute application-specific libraries to the collection of language-native ones. Several such contributed libraries exist for cluster analysis, and these substantially expand the range of available methods. Some examples are:

  • D. Corney: Clustering with Matlab
    http://www.dcorney.com/ClusteringMatlab.html
  • J. Abonyi: Clustering and Data Analysis Toolbox
    http://www.mathworks.co.uk/matlabcentral/fileexchange/