Open Search Server (OSS) is search engine software released under the GPL v3 open source licence.
Built using the best open source technologies available, Open Search Server is a stable, high-performance piece of software. It is both a modern search engine and a suite of high-powered full-text search algorithms.
Open Search Server runs on Windows 20xx/XP/Vista, Mac OS X, Solaris and Linux with a Java Virtual Machine.
OSS Engine
This add-on is a native library developed in C++ that considerably boosts the capabilities of Open Search Server. Thanks to optimised native code, OSS Engine gets exceptional performance from Open Search Server. Enhancements include:
* Relevance customisation opens up many more possibilities
* Document indexing is faster by an order of magnitude
* Improved response times
* Higher number of simultaneous queries
OSS Engine works as an add-on to Open Search Server on Linux, Solaris, Windows 20xx/XP/Vista and Mac OS X, in both 32-bit and 64-bit versions. It is distributed under a proprietary licence.
Features:
- Multi-language indexing. Documents can be indexed in sixteen languages: Chinese, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish.
- Multilingual analysers split sentences into words, then apply lemmatisation algorithms suited to the document's language (singular/plural, gender, conjugated verbs, etc.).
- The crawlers go through web sites and file systems to rapidly and easily build your index.
- Numerous document formats are supported, such as XML, HTML/XHTML, Adobe PDF, Microsoft Word, PowerPoint, OpenOffice, etc.
- The web interface is built on the ZKoss framework. It runs in the main Ajax-capable browsers. This RIA-type interface is as comfortable to use as a desktop client.
- Easy configuration through a single XML file, which includes the field definitions and the indexing options.
- Quick integration thanks to an XML interface via HTTP queries (XML over HTTP).
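The analyser step listed above (split a sentence into words, then normalise each word according to the document's language) can be illustrated with a minimal sketch. This is not Open Search Server's code: the `SUFFIX_RULES` table and the `tokenize`, `normalize` and `analyze` functions are hypothetical names, and the suffix rules are a toy stand-in for real lemmatisation.

```python
import re

# Toy per-language suffix rules (assumption for illustration only;
# real lemmatisers are far more complete than this table).
SUFFIX_RULES = {
    "en": [("ies", "y"), ("s", "")],
    "fr": [("aux", "al"), ("s", "")],
}


def tokenize(text):
    """Split a sentence into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def normalize(word, lang):
    """Apply the first matching suffix rule for the given language."""
    for suffix, replacement in SUFFIX_RULES.get(lang, []):
        # Leave very short stems alone to avoid over-stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


def analyze(text, lang):
    """Tokenise the text, then normalise every token for that language."""
    return [normalize(word, lang) for word in tokenize(text)]


print(analyze("The crawlers index queries", "en"))
# ['the', 'crawler', 'index', 'query']
```

Because the rules are selected by language code, the same text gives different terms depending on the declared language of the document, and an unknown language code leaves the tokens unchanged.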
What is new in this release:
- This developer release introduces powerful new features and some bug fixes.
- The screenshot feature automatically captures screenshots of the Web pages being crawled.
- Search queries are able to return terms from non-stored fields.
- Negative filters are available.
- The Web crawler is able to follow sitemap files.
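As a rough illustration of what following sitemap files involves, the sketch below reads one sitemap document in the sitemaps.org XML format and separates page URLs from links to further sitemaps. It is an assumption-laden example, not the crawler's implementation: `parse_sitemap` and `SAMPLE` are hypothetical names, and fetching the file over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return (page_urls, child_sitemap_urls) found in one sitemap document."""
    root = ET.fromstring(xml_text)
    # A <urlset> lists pages; a <sitemapindex> lists further sitemap files.
    pages = [loc.text.strip()
             for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]
    children = [loc.text.strip()
                for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS) if loc.text]
    return pages, children


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""

pages, children = parse_sitemap(SAMPLE)
print(pages)
# ['http://www.example.com/', 'http://www.example.com/about.html']
```

A crawler would typically queue the returned page URLs for fetching and call the same function again on each child sitemap URL.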
What is new in version 1.2.1-r987:
- New features:
- 3176150: time/date stamp
- 3186042: Option to disable robots.txt restrictions
- 3182953: Crawl URL from a database
- 3182950: Pattern and exclusion lists can be deactivated
- 3182097: Adding field boost support
- 3175585: More like this feature
- 3169421: Japanese and Korean support
- 3159477: Identify identical web pages
- 3151757: Upgrade to PDFBox 1.4.x
- 3141193: FTP/FTPS support in the file crawler
- 3141192: SMB/CIFS support in file crawler
- 3034238: Crawler able to log in protected web site
- 3011773: Add Quartz as scheduler service
- 3138603: Upgrade Tomcat to version 6.0.xx
- 3103055: Convert HTML entities
- 3087916: Upgrade to PDFBox 1.2
- 3043692: Torrent Parser
- 3042488: Audio parser
- 2882260: Add a parser for text/plain
- 3010010: RTF parser
- 3038733: Add a shingle filter
- 3036262: Log management
- 3031800: Schema interface
- 3031204: Adding NGram support
- 3008440: Index replication
- 3026212: API and interface for document deletion
- 3023327: Sub domain extraction in Web Crawler
- 2820289: Database crawler
- 3019035: Ignore dynamic URLs while crawling
- 3017277: Allow wildcard query in the URL browser
- 3016491: URL exporter
- 3016566: Monitoring API
- 3015939: Cluster collapsing
- 2830490: Size of the index
- 3011847: Score explanation
- 3008633: Possibility to turn off the highlighting
- 2997836: Extra fields from meta tags in the returned fields
- 2997826: Possibility to index only the specified content
- 2991252: Possibility to index binary file and to add it to a document
- 2982545: Extracting term frequency information
- 2881385: API to retrieve the available indices
- 2887376: Enhancement for the index page dropping indexes
- 2881388: API to list/create/modify fields in a specified schema
- 2973374: Upgrade to ZKoss 5.0.x
- 2970747: Upgrade Tomcat to version 6.0.26
- 2966139: Statistics lost when OSS restarts
- 2964704: Upgrade to Lucene 2.9.x
- 2958015: Add source archive
- 2958005: Upgrade Apache HttpClient library to 4.0.1
- 2956498: Provide a way to send statistics report by email
- 2953803: Upgrade to PDFBox 1.0
- 2953802: Upgrade to POI 3.6
- 2953575: Charset detection should look at meta http-equiv
- 2953524: Specify default charset for parser
- 2929332: Faceting post collapsing
- 2900462: Upgrade POI to 3.5 for xlsx and docx support
- 2900449: Upgrade PDFBox to 0.8
- Bug fixes:
- 3178432: Wrong cron values in the scheduler
- 3104065: File crawler crashes with java.io.EOFException
- 3090248: Statistics configuration lost when adding fields
- 3051308: is not interpreted
- 2881689: Requests.xml fails to rotate on some Windows platforms
- 3019491: NullPointerException at java.util.regex.Matcher
- 3017481: The web crawler selects the host in alphabetical order
- 3015838: Web crawler problem with UTF-8 BOM encoding
- 2993103: NoClassDefFoundError BouncyCastleProvider
- 2990960: Keywords are not highlighted in snippets
- 2982541: Phrase synonyms generate unwanted words
- 2934214: Shifted highlighting on snippet
What is new in version 1.2 Beta:
- More than 50 new features and bugfixes were added.
- An index can be replicated on a remote server.
- An n-grams filter and a shingle filter provide new possibilities, such as a suggestion box, wrong spelling tolerance, and automated topic generation.
- A database crawler supporting join queries and external files was added.
- Several improvements were made to the Web crawler, such as a URL exporter, sub-domain extraction, an exclusion list, manual crawling, and a parameter filter.
- An API and Web interface for monitoring and supervision was added.
- The new audio parser offers the ability to index torrent, MP3/MP4, OGG Vorbis, FLAC, and WMA files.
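To show why n-gram and shingle filters help with features such as suggestions and spelling tolerance, here is a minimal sketch. The function names (`char_ngrams`, `shingles`, `ngram_similarity`) and the Jaccard-overlap scoring are assumptions chosen for illustration; they do not describe how Open Search Server's filters are implemented or configured.

```python
def char_ngrams(word, min_n=2, max_n=3):
    """Return all character n-grams of the word for n in [min_n, max_n]."""
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(word) - n + 1):
            grams.append(word[i:i + n])
    return grams


def shingles(tokens, size=2):
    """Return word shingles: every run of `size` consecutive tokens."""
    return [" ".join(tokens[i:i + size])
            for i in range(len(tokens) - size + 1)]


def ngram_similarity(a, b, n=2):
    """Jaccard overlap of the n-gram sets of two words (0.0 to 1.0)."""
    grams_a = set(char_ngrams(a, n, n))
    grams_b = set(char_ngrams(b, n, n))
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


print(char_ngrams("search", 2, 2))
# ['se', 'ea', 'ar', 'rc', 'ch']
print(shingles(["open", "search", "server"]))
# ['open search', 'search server']
print(ngram_similarity("search", "serach"))
# 0.25
```

A misspelt word such as "serach" still shares some n-grams with "search", which is what makes tolerant matching possible, while shingles keep adjacent words together so that multi-word phrases can be indexed as single terms.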
What is new in version 1.1.2:
- New features:
- Add source archive
- Lucene read only support
- Bug fixes:
- PHP API fails on some queries with invalid XML characters
- Issue with statistics aggregation
- Sort functionality seems to ignore sort order
- Performance issue with large field cache
- Issue with performance of web crawler
- Negative value on web crawler statistics
- Incorrect behaviour of the wildcard function
- Duplication of returned fields in the returned XML
What is new in version 1.1:
- Synonyms support
- Spellcheck support
- Web crawler and file crawler
- Support for additional languages: Romanian, Turkish, Danish, Russian and individual Chinese characters
- OpenDocument Format support
- Management of several indices within a single instance