Idea Portal
Categories: Usability
Created by Sebastian Gil Haenelt
Created on Jun 14, 2024

Improve Heuristic for Detecting UTF-8 Encoded Text Files Without BOM in WinCube Viewer

Description: The current WinCube Viewer displays TXT files with UTF-8 encoded umlauts incorrectly. The viewer mistakenly identifies these files as "Unknown (ANSI 8)" instead of UTF-8. Other common editors, like Notepad++ and the built-in Windows Notepad, correctly recognise these files as UTF-8 without a Byte Order Mark (BOM). This suggests that the heuristic used by the WinCube Viewer to detect the file format needs improvement.

Problem Statement: When the WinCube Viewer displays TXT documents containing UTF-8 encoded umlauts, it shows incorrect characters because it decodes the bytes with the wrong character encoding. This is particularly problematic when documents need to be displayed consistently across different environments.
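To illustrate the kind of garbling described here, the following Python sketch encodes a word with an umlaut as UTF-8 and then decodes the same bytes as Windows-1252. Using Windows-1252 as the "ANSI" code page is an assumption; the code page the viewer actually falls back to is not stated in this idea.

```python
# Illustration only: the same bytes read with two different encodings.
# Assumption: the viewer's "ANSI" fallback behaves like Windows-1252 (cp1252).

text = "Käse"

# UTF-8 stores "ä" as the two bytes 0xC3 0xA4.
raw = text.encode("utf-8")

# Decoding those bytes as cp1252 maps each byte to its own character,
# so the single umlaut turns into two unrelated characters.
misread = raw.decode("cp1252")

# Decoding with the correct encoding restores the original text.
correct = raw.decode("utf-8")

print(misread)  # KÃ¤se
print(correct)  # Käse
```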

Proposed Solution: Implement an improved heuristic for detecting UTF-8 encoded text files without a BOM in the WinCube Viewer, so they are correctly identified and displayed. The logic used by common editors like Notepad++ and Windows Notepad should be used as a reference.
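One possible shape for such a heuristic is sketched below in Python. It is not the logic of Notepad++ or Windows Notepad, and it is not the viewer's code: the function name `detect_text_encoding`, its parameters, and the cp1252 fallback are all hypothetical. The idea is that bytes which decode cleanly as strict UTF-8 are very unlikely to be legacy 8-bit text, because umlauts in ANSI code pages are single bytes that do not form valid UTF-8 sequences.

```python
import codecs


def detect_text_encoding(data: bytes, fallback: str = "cp1252",
                         is_complete: bool = True) -> str:
    """Guess the encoding of raw text bytes (hypothetical helper).

    data        -- the file content, or a leading sample of it
    fallback    -- encoding to report when the bytes are not valid UTF-8
                   (cp1252 is an assumption, not the viewer's known default)
    is_complete -- False if `data` is only a prefix of the file, in which
                   case a multi-byte sequence cut off at the end is tolerated
    """
    # An explicit UTF-8 BOM settles the question immediately.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"

    # Strict validation: any byte sequence that is not well-formed UTF-8
    # raises UnicodeDecodeError.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        decoder.decode(data, final=is_complete)
    except UnicodeDecodeError:
        return fallback

    # Valid UTF-8 (pure ASCII also lands here, which is harmless because
    # ASCII decodes identically under UTF-8).
    return "utf-8"


if __name__ == "__main__":
    print(detect_text_encoding("Grüße".encode("utf-8")))   # utf-8
    print(detect_text_encoding("Grüße".encode("cp1252")))  # cp1252
```

A design note on this sketch: validating the whole file is the most reliable option; if only a leading sample is checked for performance reasons, passing `is_complete=False` avoids misclassifying a file whose sample happens to end in the middle of a multi-byte character.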

Benefits:

  • Consistent and correct display of UTF-8 encoded text files in the WinCube Viewer

  • Elimination of display errors and improved user experience

  • Alignment of the WinCube Viewer with industry standards

Additional Information: Setting the MIME type to "text/plain;charset=UTF-8" does not help in this context, because the viewer does not evaluate it. Improving the internal heuristic is therefore necessary.
