
Hermann Moisl - Cluster Analysis for Corpus Linguistics


Book:
Cluster Analysis for Corpus Linguistics
Author:
Hermann Moisl
Publisher:
Mouton De Gruyter
Year:
2015

The rapidly growing volume of digital natural language text and the complexity of data abstracted from it have increasingly rendered traditional corpus linguistic analytical methodology obsolete. This book describes a cluster analytic methodology for generating linguistic hypotheses on the basis of data abstracted from language corpora.


8 Appendix

This Appendix lists software implementations of the clustering methods presented earlier. The coverage is not exhaustive: only software known to the author to be useful either via direct experience or online reviews is included.

8.1 Cluster analysis facilities in general-purpose statistical packages

Most general-purpose statistics and data analysis packages provide some subset of the standard dimensionality reduction and cluster analysis methods: principal component analysis, factor analysis, multidimensional scaling, k-means clustering, hierarchical clustering, and sometimes others not covered in this book. In addition, they typically provide an extensive range of extremely useful data creation and transformation facilities. A selection of them is listed in alphabetical order below; URLs are given for each and are valid at the time of writing.

8.1.1 Commercial

GENSTAT
http://www.vsni.co.uk/software/genstat
MINITAB
http://www.minitab.com/en-US/products/minitab/
NCSS
http://www.ncss.com/
SAS
http://www.sas.com/
SPSS
http://www-01.ibm.com/software/uk/analytics/spss/
STATA
http://www.stata.com/
STATGRAPHICS
http://www.statgraphics.com/
STATISTICA
http://www.statsoft.com/
SYSTAT
http://www.systat.com/

8.1.2 Freeware

CHAMELEON STATISTICS
http://www.seventh-sense-software.com/chameleon.htm
MICROSIRIS
http://www.microsiris.com/
ORIGINLAB
http://www.originlab.com/
PAST
http://folk.uio.no/ohammer/past/
PSPP
http://www.gnu.org/software/pspp/
TANAGRA
http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html
- This is unusual in including the self-organizing map in addition to the standard methods.
WINIDAMS
www.unesco.org/idams/
WINSTAT
http://www.winstat.com/

8.2 Cluster analysis-specific software

The following software is designed specifically for cluster analysis.

8.2.1 Commercial

ANTHROPAC
http://www.analytictech.com/anthropac/apacdesc.htm
- Principal component analysis, factor analysis, hierarchical, multidimensional scaling
BMDP
http://www.statistical-solutions-software.com/bmdp-statistical-software/cluster-analysis/
- Hierarchical, k-means
CLUSTAN
http://www.clustan.com/
- Hierarchical, k-means
GELCOMPAR
http://www.applied-maths.com/gelcompar-ii
- Principal component analysis, hierarchical, multidimensional scaling
KCS
http://www.kovcomp.co.uk/mvsp/
- Principal component analysis, hierarchical
STATISTIXL
http://statistixl.software.informer.com/
- Principal component analysis, factor analysis, hierarchical
VISCOVERY
http://www.viscovery.net/
- Self-organizing map
VISIPOINT
http://www.visipoint.fi/
- Self-organizing map, Sammon's mapping

8.2.2 Freeware

CLUSTER 3.0
http://bonsai.hgc.jp/~mdehoon/software/cluster/
- Hierarchical, k-means, self-organizing map, principal component analysis
DATABIONIC ESOM
http://databionic-esom.sourceforge.net/
- Emergent self-organizing map, an extension of the SOM described earlier
GENESIS
http://genome.tugraz.at/genesisclient/genesisclient-description.shtml
- Principal component analysis, hierarchical, k-means, self-organizing map
MICROARRAYS CLUSTER
http://derisilab.ucsf.edu/microarray/software.html or http://rana.lbl.gov/EisenSoftware.htm
- Principal component analysis, hierarchical, k-means, self-organizing map
MULTIBASE
http://www.numericaldynamics.com/
- Principal component analysis, hierarchical
OC
http://www.compbio.dundee.ac.uk/Software/OC/oc.html
- Hierarchical
PERMUTMATRIX
http://www.lirmm.fr/~caraux/PermutMatrix/
- Hierarchical
SERF CLUSTERS
http://www.bram.org/serf/Clusters.php
- Hierarchical

8.3 Programming languages

All the foregoing packages are good, most are excellent, and any corpus linguist who is seriously interested in applying cluster analysis to his or her research can use them with confidence. That corpus linguist should, however, consider learning how to use at least one programming language for this purpose. The packages listed above offer a small subset of the dimensionality reduction and cluster analysis methods currently available in the research literature, and users of them are restricted to this subset; developments of and alternatives to these methods, such as DBSCAN and the many others that were not even mentioned, remain inapplicable. These developments and alternatives have appeared and continue to appear for a reason: to refine cluster analytic methodology. In principle, researchers should be in a position to use the best methodology available in their field, and programming makes the current state of clustering methodology accessible to corpus linguists because it renders implementation of any current or future clustering method feasible. A similar case for programming is made by Gries (2011a).
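To make the feasibility argument concrete, the following is a minimal sketch of DBSCAN, the density-based method named above as unavailable in the listed packages. It is an illustration under stated assumptions and not the book's own implementation: it is written in Python using only the standard library, the function names `region_query` and `dbscan` and the parameter names `eps` and `min_pts` are chosen here for illustration, and the data points are invented toy values standing in for rows of a corpus-derived data matrix.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

NOISE = -1  # label given to points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps of it. Clusters grow outward from core points.
    """
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep growing
    return labels


# Invented toy data: two dense groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0),
]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The sketch recomputes neighbourhoods by brute force, which is adequate for small data matrices; for large ones a spatial index or an existing library implementation would be the more practical choice.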

There are numerous programming languages, and in principle any of them can be used for corpus linguistic applications. In practice, however, two have emerged as the languages of choice for quantitative natural language processing generally: Matlab and R. Both are high-level programming languages in the sense that they provide many of the functions relevant to statistical and mathematical computation as language-native primitives and offer a wide range of excellent graphics facilities for display of results. For any given algorithm, this allows programs to be shorter and less complex than they would be for lower-level, less domain-specific languages like, say, Java or C++, and makes the languages themselves easier to learn.

Matlab (http://www.mathworks.co.uk/) is described by its website as a high-level language and interactive environment for numerical computation, visualization, and programming. It provides numerous and extensive libraries of functions specific to different types of quantitative computation such as signal and image processing, control system design and analysis, and computational finance. One of these libraries is called Math, Statistics, and Optimization, and it contains a larger range of dimensionality reduction and cluster analysis functions than any of the above software packages: principal component analysis, canonical correlation, factor analysis, singular value decomposition, multidimensional scaling, Sammon's mapping, hierarchical clustering, k-means, self-organizing map, and Gaussian mixture models. This is a useful gain in coverage, but the real advantage of Matlab over the packages is twofold. On the one hand, Matlab makes it possible for users to contribute application-specific libraries to the collection of language-native ones. Several such contributed libraries exist for cluster analysis, and these substantially expand the range of available methods. Some examples are:

D. Corney: Clustering with Matlab
http://www.dcorney.com/ClusteringMatlab.html
J. Abonyi: Clustering and Data Analysis Toolbox
http://www.mathworks.co.uk/matlabcentral/fileexchange/
