[p2p-research] Automapping of concepts

Ryan Lanham rlanham1963 at gmail.com
Sun Oct 25 22:17:41 CET 2009


A public version of AutoMap from the CASOS Center at Carnegie Mellon
University is available at:
http://www.casos.cs.cmu.edu/project/automap

AutoMap is a text mining tool that enables the extraction of network
data from texts. AutoMap can extract three types of information:
content analytic (words and frequencies), semantic networks, and
meta-networks.  AutoMap uses part-of-speech tagging and proximity
analysis to do computer-assisted Network Text Analysis (NTA). NTA
encodes the links among words in a text and constructs a network of
the linked words.
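
To make the proximity idea concrete, here is a minimal Python sketch of
linking words that occur near each other in a text. It is not AutoMap's
code or algorithm: the function name, the `window` parameter, and the
simple regex tokenizer are all assumptions made for illustration, and
part-of-speech tagging is left out entirely.

```python
import re
from collections import Counter


def build_word_network(text, window=1, delete_list=frozenset()):
    """Return a Counter of directed (source, target) word links.

    Hypothetical helper: each word is linked to the next `window` words
    that follow it, after words on the delete list are dropped.
    """
    # Naive tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in delete_list]
    edges = Counter()
    for i, source in enumerate(words):
        # Look ahead at most `window` positions from the current word.
        for target in words[i + 1 : i + 1 + window]:
            if source != target:  # skip self-links
                edges[(source, target)] += 1
    return edges


edges = build_word_network("Alice met Bob. Bob met Carol.", window=1)
for (source, target), weight in edges.items():
    print(source, "->", target, weight)
```

A larger `window` links words that are farther apart, which produces a
denser network; the edge weight counts how often a pair was seen.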

New Features in the latest v3.0.1C release include:

Redesigned user interface.  The interface has been streamlined to
provide a more intuitive experience.  A quick-launch area makes
commonly used preprocessing commands quickly available to the user.
The message window provides user feedback and reminders, such as
where you just stored that file.  All of the commands have been moved
to the menu bar to keep users from having to hunt for what is
available.

Additional features include:
- Ability to deduplicate text files by content.
- Copy files based on filename criteria so as to include the subset of
files you care about.
- Extract web pages from a single source, pulling out the text and
putting all the files into a single directory for processing.
- Extract text content from Microsoft Word 2003 documents and
Adobe PDF files.
- Ability to change the text font to any font installed on the
computer, so that any foreign-language character set can be viewed.
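
For readers wondering what deduplicating text files by content can look
like, here is a rough Python sketch. The announcement does not say how
AutoMap does it, so the approach (hashing the raw bytes of each file),
the function name, and the `*.txt` pattern are assumptions for
illustration only.

```python
import hashlib
from pathlib import Path


def find_duplicate_texts(directory):
    """Map each kept file to the later files with identical content.

    Hypothetical helper: files are compared by the SHA-256 digest of
    their bytes, so only exact copies count as duplicates.
    """
    first_seen = {}   # digest -> first path seen with that content
    duplicates = {}   # kept path -> list of paths with the same content
    for path in sorted(Path(directory).glob("*.txt")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in first_seen:
            duplicates.setdefault(first_seen[digest], []).append(path)
        else:
            first_seen[digest] = path
    return duplicates
```

Calling `find_duplicate_texts("corpus")` on a (hypothetical) directory
named `corpus` returns a dictionary, and the files listed in its values
are the ones that could be skipped. Files that differ only in whitespace
or encoding would not be caught by this byte-level comparison.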

Supplemental tools have been developed to aid the AutoMap user.  Tools
include:
- Delete list editor.  Able to interactively add or remove terms, to
add terms from a previously generated concept list, to identify
possibly misspelled words in the delete list, and to add stemmed
versions of terms on the delete list.
- Thesauri editor.  Able to interactively add or remove terms, to
identify misspelled words, to add stemmed versions of terms in the
thesauri, to sort based on the number of terms, and to merge multiple
thesauri.
- Concept list viewer.  Able to sort the concept list by frequency or
relative frequency, to compare to previously generated concept lists,
to save concepts to a delete list, to select terms by a minimum or
maximum frequency, and to save a subset of the terms into a file.
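
As a small illustration of how a concept list and a delete list fit
together, here is a Python sketch. It is not taken from AutoMap: the
function name and tokenizer are made up, and "relative frequency" is
assumed here to mean a concept's count divided by the total count of
kept words, which may differ from AutoMap's own definition.

```python
import re
from collections import Counter


def concept_list(text, delete_list=frozenset(), min_frequency=1):
    """Return (concept, frequency, relative_frequency) rows, most frequent first.

    Hypothetical helper: words on the delete list are dropped before
    counting, and rows below `min_frequency` are filtered out.
    """
    # Naive tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in delete_list]
    counts = Counter(words)
    total = sum(counts.values())
    # Relative frequency is count / total kept words (assumed definition).
    return [(concept, n, n / total)
            for concept, n in counts.most_common()
            if n >= min_frequency]


rows = concept_list("the network of the linked words the network",
                    delete_list={"the", "of"})
for concept, frequency, relative in rows:
    print(concept, frequency, relative)
```

Raising `min_frequency` corresponds to selecting terms by a minimum
frequency; the concepts filtered out that way are natural candidates for
a delete list.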

--
Michael W. Bigrigg, Project Scientist
CASOS Center, Institute for Software Research
Carnegie Mellon University
http://www.cs.cmu.edu/~bigrigg



-- 
Ryan Lanham
rlanham1963 at gmail.com
Facebook: Ryan_Lanham