SAS Text Miner - Detailed Review

    SAS Text Miner - Product Overview



    Introduction to SAS Text Miner

    SAS Text Miner is a powerful tool within the SAS software suite, specifically designed to extract valuable insights from large volumes of unstructured text data. Here’s a brief overview of its primary function, target audience, and key features:



    Primary Function

    SAS Text Miner is intended to automate the process of analyzing and extracting meaningful information from vast amounts of text data. This includes documents, emails, web content, customer feedback, and other text-based sources. The software converts unstructured text into structured data, enabling advanced analyses and predictive modeling.
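
    The conversion from unstructured text to structured data can be illustrated with a term-by-document count matrix. The following is a minimal Python sketch of that general idea, not SAS Text Miner code; the function names and sample documents are hypothetical, and the tokenizer is deliberately simplistic.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_document_matrix(documents):
    """Return (vocabulary, rows) where rows[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows

# Hypothetical sample documents (assumption, for illustration only).
docs = ["The battery died quickly.", "Battery life is great, great screen."]
vocab, matrix = term_document_matrix(docs)
```

    Each row of `matrix` is a fixed-length numeric record for one document, which is the kind of structured table that downstream modeling tools expect.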



    Target Audience

    This tool is primarily aimed at business analysts, data scientists, and modelers across various industries. It is also beneficial for marketing representatives, customer service representatives, help desk support specialists, academic and medical researchers, product managers, and anyone who needs to analyze large volumes of text to extract information, ideas, and trends.



    Key Features

    • Text Import and Processing: SAS Text Miner can import text from a wide variety of sources, including PDFs, Microsoft Word, HTML, and other formats. It filters, extracts, and converts text into a SAS data set for further analysis. The software also identifies the language of each document and transcodes it to the session encoding format.
    • Advanced Linguistic Capabilities: The tool includes features such as stemming, automatic recognition of multi-word terms, normalization of entities (like dates and currencies), part-of-speech tagging, and extraction of entities like organizations and products.
    • Integration with SAS Enterprise Miner: SAS Text Miner operates within the SAS Enterprise Miner environment, allowing seamless integration of textual data with traditional data mining variables. This enables predictive modeling, classification, and clustering using both structured and unstructured data.
    • High-Performance Text Mining: The software uses high-performance procedures to quickly evaluate large document collections, making it possible to discover essential elements in minutes or seconds.
    • Visual Presentation and Interactive Interface: SAS Text Miner provides a visual presentation of the data mining process, allowing users to drill down to relevant details and explore term connections. The interactive interface enables users to investigate derived topics and fine-tune models.
    • Classification and Clustering: The tool supports automatic classification of documents, clustering of large document collections, and the generation of classification codes from descriptions. It also helps in predicting problem categories and expected time and cost to solve problems.
    • Multi-Language Support: SAS Text Miner supports analysis in multiple languages, including Arabic, Chinese, Czech, Danish, Dutch, English, Finnish, French, German, and many others.

    By leveraging these features, SAS Text Miner helps organizations uncover hidden trends, spot business opportunities, and make fact-based decisions by integrating unstructured text data into their data mining and predictive analysis processes.

    SAS Text Miner - User Interface and Experience



    User Interface of SAS Text Miner

    The user interface of SAS Text Miner is designed to be user-friendly and flexible, so that users can analyze large volumes of text data efficiently.

    Ease of Use

    SAS Text Miner features an intuitive, drag-and-drop interface that simplifies the process of text mining. This interface conforms to Windows accessibility standards, ensuring that it is accessible to a wide range of users.

    Key Features

    • The software allows users to dynamically create data sets from files contained in a directory or crawled from the Web, making the import process straightforward.
    • It includes an interactive GUI that enables users to easily identify relevance, modify algorithms, and group materials into meaningful aggregates. This interactive interface helps in visualizing and exploring the connections between terms and topics.


    User Experience

    The overall user experience is enhanced by several key features:

    High-Performance Text Mining

    • The software leverages symmetric multiprocessing (SMP) mode and multicore processors to speed up compute-intensive tasks such as text parsing and singular value decomposition (SVD) generation. This ensures that evaluations can be run quickly, even for large collections of documents.


    Visual Interrogation of Results

    • Users can visually analyze results, explore relationships between terms, and communicate findings effectively. The interface allows for transparent drill-down capabilities, enabling users to present a high-level view of the data and drill down to relevant details.


    Automatic Boolean Rule Generation

    • The software automatically generates Boolean rules that can be used to classify content, which simplifies the categorization process and saves time.


    Term Profiling and Trending

    • Users can evaluate the relevance of terms in a collection and understand usage trends over time. This feature helps in identifying what is currently relevant and what is trending.


    Flexible Entity Options

    • Users have the flexibility to choose from predefined entities, define their own, or create custom entities for fact and event extraction. This customization allows for more accurate and relevant insights.


    Additional Features

    • Document Theme Discovery: The software identifies themes in document collections using integrated document filtering capabilities, which helps in structuring unstructured data into meaningful insights.
    • Multi-Language Support: SAS Text Miner natively supports multiple languages, including Arabic, Chinese, Czech, and many others, making it versatile for global use.

    Overall, the user interface of SAS Text Miner is designed to be intuitive, efficient, and highly interactive, making it easier for business analysts, modelers, and data scientists to extract valuable insights from large volumes of text data.

    SAS Text Miner - Key Features and Functionality



    SAS Text Miner Overview

    SAS Text Miner is a comprehensive text mining software that offers a range of features to analyze and extract valuable insights from unstructured text data. Here are the main features and their functionalities:



    High-Performance Text Mining

    SAS Text Miner allows you to quickly evaluate large document collections using high-performance text mining procedures. This capability enables fast processing of extensive text data, making it possible to discover essential elements and improve model performance in a short amount of time.



    User-Friendly, Flexible Interface

    The software features a user-friendly interface that conforms to Windows accessibility standards. This interface is intuitive and flexible, making it easier for both business analysts and statisticians to perform text mining tasks without extensive technical knowledge.



    Automatic Boolean Rule Generation

    SAS Text Miner can automatically generate Boolean rules, which simplifies the classification of content. This feature helps in categorizing text data based on predefined criteria, streamlining the analysis process.
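
    To make the idea of a Boolean classification rule concrete, here is a small Python sketch of how such a rule could be applied to a document once it exists. It does not show how SAS Text Miner generates rules; the rule structure, the `billing_rule` example, and the function names are assumptions made for illustration.

```python
import re

def tokens(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def matches(rule, text):
    """A rule is a list of clauses joined by OR. Each clause lists terms that
    must all appear ('all') and terms that must not appear ('none')."""
    words = tokens(text)
    return any(
        set(clause["all"]) <= words and not (set(clause["none"]) & words)
        for clause in rule
    )

# Hypothetical rule for a "billing" category:
# (refund AND NOT shipping) OR (invoice AND charge)
billing_rule = [
    {"all": ["refund"], "none": ["shipping"]},
    {"all": ["invoice", "charge"], "none": []},
]

is_billing = matches(billing_rule, "Please refund my order")
```

    Because each rule is a plain true-or-false test on term presence, an analyst can read it, edit it, and see why a document was assigned to a category.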



    Term Profiling and Trending

    The software allows you to evaluate the relevance of terms within a collection and understand usage trends over time. This feature is crucial for identifying key topics, observing how terms change, and determining their significance in the context of the analysis.
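
    Term trending can be pictured as the share of documents that mention a term in each time period. The Python sketch below illustrates that calculation under stated assumptions: the period labels, sample documents, and function name are hypothetical, and SAS Text Miner's own relevance measures may differ.

```python
import re
from collections import defaultdict

def term_trend(dated_docs, term):
    """Return {period: fraction of that period's documents containing term}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for period, text in dated_docs:
        totals[period] += 1
        if term in re.findall(r"[a-z]+", text.lower()):
            hits[period] += 1
    return {period: hits[period] / totals[period] for period in sorted(totals)}

# Hypothetical (period, text) pairs, for illustration only.
docs = [
    ("2023-01", "Battery drains fast"),
    ("2023-01", "Great screen"),
    ("2023-02", "Battery swelling issue"),
    ("2023-02", "Battery replaced under warranty"),
]
trend = term_trend(docs, "battery")
```

    A rising fraction from one period to the next is the kind of signal that would mark a term as trending.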



    Document Theme Discovery

    SAS Text Miner can identify themes in document collections using integrated document filtering capabilities. This feature helps in distilling key concepts contained in large document collections and analyzing relationships between isolated terms or phrases and documents.



    Visual Interrogation of Results

    The software enables visual analysis of results, allowing you to easily explore relationships between terms and communicate findings effectively. This visual approach facilitates a better understanding of the data and its implications.



    Flexible Entity Options

    You can choose from pre-defined entities or define your own custom entities for fact and event extraction. This flexibility is particularly useful for extracting specific information such as organizations, products, dates, and more.



    Easy Text Importing

    SAS Text Miner provides an interactive interface for easily importing any text document, regardless of its format. This feature supports documents from various sources, including the web, comment fields, books, and other text sources.



    Native Support for Multiple Languages

    The software supports text analysis in multiple languages, including English, Danish, Dutch, Finnish, French, German, Italian, Japanese, Korean, Norwegian Bokmål, Portuguese, Simplified Chinese, Spanish, Swedish, and Traditional Chinese. This multi-language support is based on pre-defined input variables.



    Integration with Structured Data

    SAS Text Miner allows you to merge the structured outputs of text mining with existing structured data. This integration enhances the reliability of predictive analyses by combining insights from both unstructured and structured data sources.
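
    As a rough illustration of what merging text mining output with structured data means, the Python sketch below joins per-customer topic scores onto existing customer records by a shared key. The column names, the `customer_id` key, and the default value for customers without text are all assumptions, not SAS Text Miner behavior.

```python
def merge_text_features(structured_rows, text_features, key="customer_id", default=0.0):
    """Attach text-derived columns to each structured row, matched on `key`.
    Rows with no text-derived values receive `default` for every text column."""
    feature_names = sorted({name for feats in text_features.values() for name in feats})
    merged = []
    for row in structured_rows:
        feats = text_features.get(row[key], {})
        combined = dict(row)
        for name in feature_names:
            combined[name] = feats.get(name, default)
        merged.append(combined)
    return merged

# Hypothetical structured data and text-derived topic scores.
customers = [
    {"customer_id": 1, "tenure_months": 14},
    {"customer_id": 2, "tenure_months": 3},
]
topic_scores = {1: {"topic_billing": 0.8, "topic_shipping": 0.1}}
table = merge_text_features(customers, topic_scores)
```

    The merged table has one row per customer with both kinds of columns, so a single predictive model can use them together.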



    AI and Machine Learning Integration

    The software leverages machine learning and natural language processing techniques to automate time-consuming manual activities such as theme extraction and key term relationships. These AI-driven capabilities help in quickly discovering essential elements and improving model performance.



    Conclusion

    In summary, SAS Text Miner is equipped with a range of features that make it a powerful tool for analyzing and extracting insights from unstructured text data. Its integration with AI and machine learning enhances its capabilities, making it a valuable asset for predictive modeling and data analysis.

    SAS Text Miner - Performance and Accuracy



    Performance

    SAS Text Miner is known for its high-performance capabilities, particularly with the introduction of the High Performance Text Miner modules in SAS Enterprise Miner version 13.1. Here are some performance highlights:

    Speed and Efficiency

    High Performance Text Miner nodes are designed to run faster using high-speed algorithms and leveraging multi-core and cluster capabilities. This allows for quick evaluations of large document collections, often completing in minutes or seconds.

    Resource Utilization

    The software is optimized to handle memory-intensive activities efficiently, though issues can arise if data is stored on external drives or if there are significant I/O delays.

    User Interface

    The interface is user-friendly and flexible, conforming to Windows accessibility standards, which makes it easier for users to manage and analyze text data.

    Accuracy

    The accuracy of SAS Text Miner is influenced by several factors:

    Algorithmic Accuracy

    The software uses various text mining procedures, including text parsing, filtering, and topic identification, which help in extracting meaningful insights from text data. However, accuracy varies with the quality of the input data and the specific algorithms used. For instance, one comparison with Python’s NLTK found that both SAS Text Miner and NLTK reached only about 33-34% accuracy when paired with simple neural networks, a figure that more sophisticated models can improve on.

    Customization and Refinement

    SAS Text Miner allows for the use of custom entities, term trend discovery, and interactive GUIs to refine automatically generated rules and topics. This customization can help improve the accuracy of the analysis by aligning it more closely with the specific needs of the project.

    Limitations and Areas for Improvement

    While SAS Text Miner offers several advantages, there are some limitations and areas where improvements can be made:

    Data Preparation

    The software requires careful data preparation, such as ensuring that variables are properly formatted and avoiding unnecessarily long field lengths, which can slow down processing.

    File Structure

    Although the High Performance Text Miner has improved in this regard, older versions of SAS Text Miner required documents to be stored in individual files within a single directory, which could be cumbersome for large datasets.

    User Expertise

    While the interface is user-friendly, it still requires a good understanding of text mining processes and data analysis. Users with less experience might find it challenging to fully leverage the advanced features of the software.

    In summary, SAS Text Miner offers strong performance and accuracy, especially with its high-performance modules, but it does require careful data preparation and some level of user expertise to maximize its benefits.

    SAS Text Miner - Pricing and Plans



    Pricing Structure of SAS Text Miner



    Overview

    The provided sources do not offer specific details on the pricing plans, tiers, or any free options for SAS Text Miner.



    Sources of Information

    • The official SAS website for SAS Text Miner does not mention pricing details.
    • The white paper on SAS Text Miner also does not include pricing information.
    • Other resources, such as the SAS communities and support pages, focus on the features, functionality, and usage of SAS Text Miner but do not provide pricing details.


    Recommendation

    To obtain accurate pricing information, you would need to contact SAS directly or visit their sales or pricing pages, which are not linked in the provided sources.

    SAS Text Miner - Integration and Compatibility



    Integration with Other Tools

    SAS Text Miner is closely integrated with SAS Enterprise Miner, allowing users to analyze both structured data and unstructured text within a single platform.



    SAS Enterprise Miner

    SAS Text Miner is a component of SAS Enterprise Miner, enabling users to create workflows that combine text analysis with data mining and machine learning algorithms. This integration allows the output from text analysis to feed directly into the data mining and machine learning processes within Enterprise Miner.



    Base SAS and SAS/STAT

    Base SAS, which is included with SAS Enterprise Miner, must be installed to use SAS Text Miner. However, SAS/STAT, which Enterprise Miner requires, is not included with the Text Miner license.



    Compatibility Across Platforms and Devices

    SAS Text Miner is compatible with several operating systems and configurations:



    Operating Systems

    SAS Text Miner supports Windows 10 on the x64 chip family. It also supports other operating systems, but the specific support can vary based on the SAS version. For example, it supports Red Hat Enterprise Linux and other server-class systems, but not z/OS or 32-bit Windows platforms.



    Hardware Requirements

    The software requires sufficient disk I/O performance and at least 4 GB of RAM, with 8 GB recommended for large deployments.



    Installation

    To use SAS Text Miner, both SAS and SAS Enterprise Miner must be installed on the same machine. The installation involves selecting the appropriate products during the installation process, including SAS Foundation, SAS Enterprise Miner Workstation Configuration, and SAS Text Miner Workstation Configuration.



    Usage and Accessibility



    GUI and Nodes

    SAS Text Miner operates within the SAS Enterprise Miner GUI, providing specific nodes for text mining tasks such as text import, parsing, filtering, topic generation, and clustering. These nodes help in analyzing and organizing unstructured text data.



    Web Browsers

    While the primary interaction is through the Enterprise Miner client, SAS software may also be accessed via supported web browsers, with the list of supported browsers updated regularly.

    By integrating seamlessly with SAS Enterprise Miner and supporting various operating systems and hardware configurations, SAS Text Miner provides a comprehensive tool for analyzing both structured and unstructured data.

    SAS Text Miner - Customer Support and Resources



    Customer Support Options



    Phone Support

    Users can report critical problems via phone. SAS offers a US toll-free number (1-800-727-0025) and a number for the US Headquarters (1-919-677-8008).



    Live Chat

    Users can initiate a chat with SAS Technical Support by clicking the blue chat button on the SAS website. This service is available in the US, Canada, and some US territories.



    Email Support

    Users can send an email to support@sas.com with a detailed description of the problem, including product and version information, operating system details, site number, and any troubleshooting steps taken. Attachments can be included to provide additional context.



    Additional Resources



    Documentation and Guides

    SAS offers detailed documentation, such as the SAS Text Miner fact sheet and user guides, which provide in-depth information on using the software. These resources include step-by-step instructions and explanations of various features and functionalities.



    Support Website

    The SAS support website includes a wealth of information, including technical support policies, tips for contacting support, and ways to update existing support requests.



    Accessibility Features

    SAS Text Miner includes accessibility features to ensure the software is user-friendly for all users, conforming to Windows accessibility standards.



    Training and Learning Materials

    Resources like “Getting Started with SAS Text Miner” help new users learn by example, providing scenarios and practical applications to get started quickly.



    Community and Forums

    While not explicitly mentioned in the provided sources, SAS generally has a community and forums where users can share experiences, ask questions, and get help from other users and experts.

    These resources and support options are designed to help users overcome any challenges they might encounter while using SAS Text Miner, ensuring they can fully leverage the software’s capabilities.

    SAS Text Miner - Pros and Cons



    Advantages of SAS Text Miner

    SAS Text Miner offers several significant advantages that make it a valuable tool for analyzing unstructured data:

    High-Performance Text Mining

    SAS Text Miner can quickly evaluate large document collections using high-performance procedures, allowing for rapid analysis even of extensive data sets.

    User-Friendly Interface

    The software features a user-friendly, flexible interface that conforms to Windows accessibility standards, making it easier for users to process and analyze text data.

    Automatic Boolean Rule Generation

    It can automatically generate Boolean (true/false) rules, streamlining the text classification process and saving time on manual activities.

    Term Profiling and Trending

    SAS Text Miner efficiently evaluates the relevance of terms in a collection and tracks their usage trends over time, helping in identifying key topics and phrases.

    Document Theme Discovery

    The software can identify themes in document collections and assign documents to these themes, either exclusively (Text Cluster node) or to multiple themes (Text Topic node), depending on the need.
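
    The difference between exclusive and multiple theme assignment can be sketched in a few lines of Python. This is only an illustration of the two assignment styles: the score table, the 0.3 cutoff, and the function names are hypothetical and do not reflect how the Text Cluster or Text Topic nodes compute their results.

```python
def assign_exclusive(scores):
    """Cluster-style: each document gets the single highest-scoring theme."""
    return {doc: max(themes, key=themes.get) for doc, themes in scores.items()}

def assign_multiple(scores, cutoff=0.3):
    """Topic-style: each document gets every theme whose score reaches the cutoff."""
    return {
        doc: sorted(theme for theme, score in themes.items() if score >= cutoff)
        for doc, themes in scores.items()
    }

# Hypothetical document-to-theme scores, for illustration only.
scores = {
    "doc1": {"billing": 0.6, "shipping": 0.4},
    "doc2": {"billing": 0.1, "shipping": 0.9},
}
exclusive = assign_exclusive(scores)
multiple = assign_multiple(scores)
```

    In this example `doc1` is labeled only "billing" under exclusive assignment, while multiple assignment keeps its secondary "shipping" theme as well.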

    Visual Analysis

    It provides visual interrogation of results, allowing users to easily explore relationships between terms and communicate findings effectively.

    Multi-Language Support

    SAS Text Miner supports multiple languages, enabling analysis of text data from diverse sources without language barriers.

    Integration with Other SAS Tools

    The software integrates seamlessly with other SAS components, allowing users to merge discovered text data with existing structured data for more accurate predictive analyses.

    Disadvantages of SAS Text Miner

    While SAS Text Miner is a powerful tool, there are some limitations and challenges to consider:

    Training Requirement

    Using SAS Text Miner effectively requires training, which can be time-consuming, especially for newcomers to text mining and SAS software.

    Time Consumption for Large Data Sets

    Although the software is high-performance, analyzing large quantities of data can still take significant time, even if it is faster than manual methods.

    Potential Bias in Theme Assignment

    The Text Cluster node assigns each document to exactly one theme, which can ignore other relevant themes present in the document and potentially lead to biased results.

    Compatibility and Stability Issues

    While not specific to SAS Text Miner alone, SAS products can sometimes have stability issues, particularly with different Java versions, which may require frequent system restarts.

    In summary, SAS Text Miner is a powerful tool for text analysis, offering high-performance capabilities, user-friendly interfaces, and extensive feature sets. However, it requires training, can be time-consuming for large data sets, and may have some compatibility issues.

    SAS Text Miner - Comparison with Competitors



    When Considering SAS Text Miner

    When considering SAS Text Miner in the context of AI-driven research tools, it’s important to highlight its unique features and compare them with those of its competitors.



    Unique Features of SAS Text Miner

    • Advanced Linguistic Capabilities: SAS Text Miner integrates advanced linguistic capabilities, allowing it to automatically read and analyze large volumes of text data. This includes identifying common topics, ideas, and trends, and structuring text into numeric representations for predictive and data mining modeling.
    • Integration with Structured Data: It stands out by consolidating structured and unstructured data in a common environment, providing a more accurate and complete view of the data. This integration enables the creation of descriptive and predictive models that help in spotting opportunities and recognizing trends.
    • High-Performance Procedures: The software leverages multicore processing hardware to expedite compute-intensive text processing tasks, making it efficient for handling large datasets.
    • Interactive Interface: SAS Text Miner offers a visual presentation of the data mining process, allowing users to drill down to relevant details and fine-tune models interactively.


    Competitors and Alternatives



    Altair AI Studio

    • Key Difference: Altair AI Studio focuses more on general AI and machine learning capabilities, including data science and analytics. While it can handle text data, it may not have the same level of specialized text mining features as SAS Text Miner.
    • Unique Feature: Altair AI Studio is known for its user-friendly interface and the ability to integrate with various data sources, making it a versatile tool for different types of analytics.


    SAP HANA Cloud

    • Key Difference: SAP HANA Cloud is a cloud-based platform that offers a broad range of analytics and data management capabilities. It includes text analysis features but is more geared towards enterprise-level data management and real-time analytics.
    • Unique Feature: SAP HANA Cloud excels in real-time data processing and provides a scalable cloud infrastructure, which can be beneficial for large-scale text data analysis.


    Forsta

    • Key Difference: Forsta is primarily focused on market research and customer experience analytics. It uses AI to analyze customer feedback and other forms of unstructured data, but it may not offer the same depth of text mining capabilities as SAS Text Miner.
    • Unique Feature: Forsta is specialized in handling customer feedback data and provides tools for sentiment analysis and customer journey mapping.


    Other Considerations

    • Customization and Expertise: SAS Text Miner allows for significant customization through interactive GUIs, enabling users to modify relevance scores and guide machine learning results with human insight. This is particularly useful for subject-matter experts who need to fine-tune the analysis based on their knowledge.
    • Scalability and Performance: While competitors like SAP HANA Cloud offer scalable cloud solutions, SAS Text Miner’s high-performance procedures and ability to leverage multicore processing make it highly efficient for large-scale text data analysis.

    In summary, SAS Text Miner is a powerful tool specifically designed for advanced text mining and integration with structured data, making it a strong choice for business analysts, modelers, and data scientists. However, depending on your specific needs, alternatives like Altair AI Studio, SAP HANA Cloud, and Forsta may offer different strengths that could be more aligned with your requirements.

    SAS Text Miner - Frequently Asked Questions



    Do my text documents have to be a SAS dataset to use in Text Miner?

    No, they do not. The Text Import node in SAS Text Miner has the capability to import text in various formats, such as Word, Excel, PowerPoint, and PDF files. This flexibility allows you to work with text data from different sources without needing to convert them into a SAS dataset first.



    Can I use my own Stop/Start list in the Parsing Node?

    Yes, you can. You can create your own SAS datasets as Stop or Start lists. The format for these datasets is documented in the Text Miner documentation, and samples are provided in the SAMPSIO sample library.
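
    Conceptually, a stop list names terms to drop and a start list names the only terms to keep. The Python sketch below illustrates that difference on a list of parsed terms; it is not the SAS data set format mentioned above, and the function name and sample terms are hypothetical.

```python
def apply_term_list(terms, stop_list=None, start_list=None):
    """Filter terms with a start list (keep only listed terms) or,
    if no start list is given, a stop list (drop listed terms)."""
    if start_list is not None:
        return [term for term in terms if term in start_list]
    if stop_list is not None:
        return [term for term in terms if term not in stop_list]
    return list(terms)

# Hypothetical parsed terms, for illustration only.
terms = ["the", "battery", "is", "dead"]
after_stop = apply_term_list(terms, stop_list={"the", "is"})
after_start = apply_term_list(terms, start_list={"battery"})
```

    A stop list suits open-ended exploration, while a start list is the stricter choice when only a known vocabulary matters.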



    Does the order of the nodes make a difference in Text Miner?

    Yes, it does. After the text data is imported, the text parsing node must precede the text filter node. The other text nodes can be added in any order after these initial steps.
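
    The ordering constraint can be expressed as a simple check on a list of node names. The Python sketch below is a hypothetical validator written for illustration, not part of SAS Enterprise Miner; it assumes the downstream text nodes are Text Topic and Text Cluster.

```python
def is_valid_flow(nodes):
    """Return True if Text Parsing precedes Text Filter and any
    Text Topic or Text Cluster node comes after Text Filter."""
    if "Text Parsing" not in nodes or "Text Filter" not in nodes:
        return False
    parse_at = nodes.index("Text Parsing")
    filter_at = nodes.index("Text Filter")
    if parse_at >= filter_at:
        return False
    downstream = {"Text Topic", "Text Cluster"}
    return all(i > filter_at for i, name in enumerate(nodes) if name in downstream)

flow = ["Text Import", "Text Parsing", "Text Filter", "Text Topic", "Text Cluster"]
flow_ok = is_valid_flow(flow)
```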



    Can I pull all the email addresses out of the text document collection?

    Yes, you can. To do this, turn on the “Find Entities” option in the properties of the Text Parsing Node. When viewing the results of that node’s run, save the “Terms” table. Query this table for all the terms with the role “Internet,” which captures email addresses and URLs.
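
    Because the "Internet" role covers both email addresses and URLs, the saved table still needs a second filter to isolate the addresses. The Python sketch below shows that query on an exported copy of the table; the column names `term` and `role` and the sample rows are assumptions, so adjust them to match the actual export.

```python
def email_terms(rows):
    """Return the sorted terms with role 'Internet' that look like email addresses."""
    return sorted(
        row["term"]
        for row in rows
        if row["role"] == "Internet" and "@" in row["term"]
    )

# Hypothetical rows from an exported Terms table (column names are assumed).
terms_table = [
    {"term": "jane.doe@example.com", "role": "Internet"},
    {"term": "https://example.com", "role": "Internet"},
    {"term": "battery", "role": "Noun"},
    {"term": "ops@example.org", "role": "Internet"},
]
emails = email_terms(terms_table)
```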



    Does SAS Text Miner work effectively for all lengths of data?

    Yes, it does. SAS Text Miner can handle data of various lengths, from short Twitter feeds to lengthy documents like 10-page papers. The system is designed to process and analyze text data regardless of its length.



    Can I analyze text data in multiple languages?

    Yes, you can. SAS Text Miner supports a wide range of languages, including Arabic, Chinese, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hungarian, Indonesian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, Swedish, Thai, Turkish, and Vietnamese. It can identify each document’s language and transcode it to the session encoding format.



    How does SAS Text Miner integrate with structured data?

    SAS Text Miner tightly integrates text-based information with structured data. This integration allows analysts to obtain a complete view of both structured and unstructured information from a single interface, enhancing analyses and decision-making. The system can merge discovered data from text mining with existing structured data for more accurate predictive modeling.



    What types of text data can SAS Text Miner process?

    SAS Text Miner can process a wide variety of text data, including web content, email messages, news articles, research papers, surveys, and more than 200 office file types. It can also extract, transform, and load textual data from formats such as PDFs, Microsoft Word, extended ASCII text, HTML, and various database formats.



    Can I customize the entities extracted from text inputs?

    Yes, you can. SAS Text Miner allows you to define your own custom entities for fact and event extraction. You can choose from predefined entities or create your own using the SAS Concept Creation add-on. This includes defining multiword terms and custom entities such as specific districts or product codes.



    How does SAS Text Miner help in predictive modeling?

    SAS Text Miner structures text into numeric representations that can be used as inputs to predictive and data mining modeling techniques. This allows for better understanding of customer, service, and product needs and helps in predicting opportunities for timely exploitation. The outputs from text mining can be merged with numeric data from other analytic processes without additional formatting steps.
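
    One common way to turn text into numeric model inputs is TF-IDF weighting, shown in the Python sketch below. TF-IDF is used here as a familiar stand-in and is not claimed to be the weighting SAS Text Miner applies; the function name and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dict per document, where weight is
    (term count / document length) * log(number of documents / document frequency)."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical sample documents, for illustration only.
vectors = tfidf(["slow delivery", "slow refund refund"])
```

    Terms that appear in every document, such as "slow" here, receive a weight of zero, while terms that distinguish a document receive larger weights that a predictive model can use as inputs.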



    What are the key features of the SAS Text Miner interface?

    The SAS Text Miner interface offers several key features, including an interactive interface for importing text from the Web or internal file systems, text parsing to extract terms or phrases, and transformation of data into structured representations. It also provides a visual presentation of the entire data mining process, allowing users to drill down to relevant details and explore term connections.

    SAS Text Miner - Conclusion and Recommendation



    Final Assessment of SAS Text Miner

    SAS Text Miner is a powerful tool in the AI-driven research tools category, specifically designed to analyze and extract valuable insights from large volumes of unstructured text data. Here’s a detailed assessment of its benefits and who would most benefit from using it.



    Key Benefits

    • High-Performance Text Mining: SAS Text Miner allows users to quickly evaluate large document collections using high-performance text mining procedures, making it ideal for handling vast amounts of data efficiently.
    • User-Friendly Interface: The software features a user-friendly, flexible interface that conforms to Windows accessibility standards, making it accessible to a wide range of users.
    • Advanced Text Analysis: It includes capabilities such as automatic Boolean rule generation, term profiling and trending, document theme discovery, and visual interrogation of results. These features help in extracting key themes, identifying highly related phrases, and observing term usage trends over time.
    • Integration with Structured Data: SAS Text Miner seamlessly integrates with SAS Enterprise Miner, enabling the joint evaluation of both structured and unstructured data elements. This integration is crucial for predictive modeling, segmentation, and other data mining activities.


    Who Would Benefit Most

    • Business Analysts and Market Researchers: These professionals can use SAS Text Miner to uncover underlying themes in customer feedback, survey data, and other textual sources, which can be integrated with structured data to improve predictive models.
    • Customer Service and Help Desk Teams: By analyzing customer complaints and comments, these teams can identify common issues, predict problem categories, and optimize response times and costs.
    • Academic and Medical Researchers: Researchers can benefit from clustering large document collections, such as scientific or legal databases, to extract key concepts and trends.
    • Product Managers and Marketing Representatives: These individuals can use the software to detect issues with product packaging, distribution, and formulation, as well as to capture business announcements and predict market trends.


    Overall Recommendation

    SAS Text Miner is highly recommended for organizations and individuals who need to extract valuable insights from large volumes of unstructured text data. Its ability to automate time-consuming manual activities, such as theme extraction and key term relationships, using machine learning and natural language processing techniques, makes it a valuable tool for various industries.

    The software’s integration with SAS Enterprise Miner and its support for multiple languages add to its versatility. It is particularly beneficial for those who need to classify documents, cluster large document collections, and integrate text data with structured data to enhance predictive modeling.

    In summary, SAS Text Miner is a powerful and efficient tool that can significantly enhance the analysis of unstructured text data, making it an excellent choice for anyone looking to gain deeper insights from textual sources.
