cpdetector

Software Details:
Version: 1.0.10
Upload Date: 15 Apr 15
Developer: Achim Westermann
Distribution Type: Freeware
Downloads: 3

Rating: 2.0/5 (Total Votes: 1)


cpdetector is a small yet clever framework for codepage detection that integrates different strategies. It may be used as a library by third-party software that accesses textual data over a network.

It also includes a best-practice implementation in the form of a command-line tool that sorts and transforms large collections of documents based on their codepage.

Available strategies include: jchardet (exclusion, frequency analysis, and guessing), detection of the HTML charset property, and detection of the XML encoding declaration.
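The idea of combining several detection strategies can be illustrated with a minimal sketch. It is not cpdetector's API or implementation: the function names and regular expressions below are hypothetical, it covers only the two declaration-based strategies (XML encoding declaration and HTML charset property), and it assumes the document starts in an ASCII-compatible encoding so that the declaration itself is readable.

```python
import re

# Hypothetical patterns for illustration only; not taken from cpdetector.
XML_DECL = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._:-]+)["\']'
)
HTML_META = re.compile(
    rb'<meta[^>]*\bcharset\s*=\s*["\']?([A-Za-z0-9._:-]+)',
    re.IGNORECASE,
)


def detect_xml_encoding(head: bytes):
    """Return the name given in an XML encoding declaration, or None."""
    match = XML_DECL.search(head)
    return match.group(1).decode("ascii") if match else None


def detect_html_charset(head: bytes):
    """Return the charset named in an HTML <meta> tag, or None."""
    match = HTML_META.search(head)
    return match.group(1).decode("ascii") if match else None


def detect_codepage(data: bytes,
                    strategies=(detect_xml_encoding, detect_html_charset)):
    """Try each strategy in order and return the first answer found."""
    head = data[:1024]  # declarations appear near the start of a document
    for strategy in strategies:
        name = strategy(head)
        if name is not None:
            return name
    return None  # undetected: a statistical guesser could be tried next
```

In this sketch the order of the strategies matters: the first one that returns a name wins, so cheap and reliable checks are best placed before guessing-based ones.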

What is a code page?

At its core, a textual document is nothing more than a sequence of bits. A computer has to decide how to display this data as characters (which the computer identifies by numbers).

A code page, also known as a charset encoding, maps the raw data of a textual document to characters. The original ASCII code page, for example, uses only 7 bits of each octet (byte) to identify a character, so it can map only 128 different characters. In the past, memory was expensive, and computers most often had only 8-bit registers and buses.
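As a small illustration of why the code page matters, the following sketch decodes the same single byte under several code pages using Python's built-in codecs. The helper name `interpretations` is hypothetical and chosen only for this example.

```python
# 7 bits give 2**7 distinct values, which is ASCII's whole repertoire.
ASCII_SIZE = 2 ** 7


def interpretations(raw: bytes, code_pages):
    """Decode the same bytes under each code page and collect the results."""
    return {name: raw.decode(name) for name in code_pages}


raw = bytes([0xE4])  # one byte with the eighth bit set
result = interpretations(raw, ["latin-1", "cp437", "cp1251"])
# result["latin-1"] is "ä", result["cp437"] is "Σ", result["cp1251"] is "д"

# ASCII cannot represent this byte at all, because it lies above 127.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

The byte never changes; only the mapping applied to it does. This is why a program that reads text without knowing its code page has to detect or guess it.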

When a mainframe was designed, its makers had to decide which characters it should support. Physicists and mathematicians, for example, needed special characters for equations. As a result, a computer often shipped with its own special code page.

What is new in this release:

  • This major bugfix version fixes two issues in command-line batch mode.
  • The switch to skip moving undetected documents works again.
  • No attempt is made to transcode undetected documents (such attempts caused exceptional program flow).

What is new in version 1.0.8:

  • This stability release fixes byte order mark detection and an incompatibility with OpenJDK. It now requires Java 1.5.

Other Software of Developer Achim Westermann

JChart2D

11 May 15

JChart2D

6 May 15
