TET PDF IFilter extracts text and metadata from PDF documents and makes them available to search and retrieval software on Windows. This allows PDF documents to be searched on the local desktop, a corporate server, or the Web. TET PDF IFilter is based on the patented PDFlib Text Extraction Toolkit (TET), a developer product for reliably extracting text from PDF documents.
TET PDF IFilter is a robust implementation of Microsoft’s IFilter indexing interface. It works with all search and retrieval products which support the IFilter interface, e.g. SharePoint and SQL Server. Such products use format-specific filter programs – called IFilters – for particular file formats, e.g. HTML. TET PDF IFilter is such a program, aimed at PDF documents. The user interface for searching the documents may be the Windows Explorer, a Web or database frontend, a query script, or a custom application. As an alternative to interactive searches, queries can also be submitted programmatically without any user interface.
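As an illustration of a programmatic query, the following minimal sketch builds a Windows Search SQL statement restricted to PDF files and, on a Windows machine with the `pywin32` package installed, submits it through ADO. The function names `build_pdf_query` and `run_query` are hypothetical helpers introduced here, not part of TET PDF IFilter; the sketch also assumes that the PDF files have already been indexed by Windows Search with a PDF IFilter installed.

```python
def build_pdf_query(phrase: str, max_rows: int = 20) -> str:
    """Return a Windows Search SQL statement that finds PDFs containing `phrase`.

    Hypothetical helper for illustration; not part of TET PDF IFilter.
    """
    # Double quotes delimit the search phrase and single quotes delimit the
    # SQL string literal, so remove the former and double the latter.
    safe = phrase.replace('"', " ").replace("'", "''").strip()
    if not safe:
        raise ValueError("phrase must not be empty")
    return (
        f"SELECT TOP {int(max_rows)} System.ItemPathDisplay "
        "FROM SystemIndex "
        f"WHERE CONTAINS('\"{safe}\"') "
        "AND System.FileExtension = '.pdf'"
    )


def run_query(sql: str) -> list[str]:
    """Run `sql` against the local Windows Search index (Windows only)."""
    # Imported lazily so that build_pdf_query() stays usable on any platform.
    import win32com.client  # provided by the pywin32 package

    conn = win32com.client.Dispatch("ADODB.Connection")
    conn.Open("Provider=Search.CollatorDSO;Extended Properties='Application=Windows';")
    rs = win32com.client.Dispatch("ADODB.Recordset")
    rs.Open(sql, conn)
    paths = []
    while not rs.EOF:
        paths.append(rs.Fields.Item("System.ItemPathDisplay").Value)
        rs.MoveNext()
    rs.Close()
    conn.Close()
    return paths


if __name__ == "__main__":
    # Only prints the statement; call run_query() on an indexed Windows system.
    print(build_pdf_query("text extraction", max_rows=5))
```

Keeping the query construction separate from the execution makes it possible to inspect the statement on any platform, while the ADO call is only needed on the machine that hosts the index.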
Based on patented TET technology
PDFlib TET, the basis of TET PDF IFilter, was first released in 2002, and has been used by customers worldwide in server and desktop environments. As an alternative to extracting PDF page contents and metadata as raw text, TET can supply the document contents in XML format. TET is also available as a free plugin for Adobe Acrobat; this plugin allows interactive testing and evaluation of TET’s superior text extraction.
Unique advantages
TET PDF IFilter offers the following advantages:
- Indexes not only page content, but also metadata, bookmarks, PDF attachments, and PDF packages/portfolios
- Extracts text even from PDFs where Acrobat fails
- Indexes XMP image metadata
- Performance: thread-safe, fast and robust, 32- and 64-bit
- Lean stand-alone product without side effects
- Automatic language/script detection
- Actively supported by a dedicated team
Enterprise PDF search
TET PDF IFilter is available in fully thread-safe native 32- and 64-bit versions. You can implement enterprise PDF search solutions with TET PDF IFilter and the following products:
- Microsoft Office SharePoint Server (MOSS)
- Microsoft Search Server 2008 and the free Search Server 2008 Express
- Microsoft SQL Server
- Microsoft Exchange Server
TET PDF IFilter can be used with all other Microsoft and third-party products which support the IFilter interface.
Desktop PDF search
TET PDF IFilter can also be used to implement desktop PDF search, e.g. with the following products:
- Windows Desktop Search (WDS): integrated in Windows Vista; also available as a free add-on for Windows XP
- Windows Indexing Service
TET PDF IFilter is freely available for non-commercial desktop use, which provides a convenient basis for test and evaluation.