Character frequencies in Czech text, useful mainly for frequency analysis.
| Character | Frequency (%) | 
|---|---|
| a | 6.2193% | 
| á | 2.2355% | 
| b | 1.5582% | 
| c | 1.6067% | 
| č | 0.9490% | 
| d | 3.6019% | 
| ď | 0.0222% | 
| e | 7.6952% | 
| é | 1.3346% | 
| ě | 1.6453% | 
| f | 0.2732% | 
| g | 0.2729% | 
| h | 1.2712% | 
| ch | 1.1709% | 
| i | 4.3528% | 
| í | 3.2699% | 
| j | 2.1194% | 
| k | 3.7367% | 
| l | 3.8424% | 
| m | 3.2267% | 
| n | 6.5353% | 
| ň | 0.0814% | 
| o | 8.6664% | 
| ó | 0.0313% | 
| p | 3.4127% | 
| q | 0.0013% | 
| r | 3.6970% | 
| ř | 1.2166% | 
| s | 4.5160% | 
| š | 0.8052% | 
| t | 5.7268% | 
| ť | 0.0426% | 
| u | 3.1443% | 
| ú | 0.1031% | 
| ů | 0.6948% | 
| v | 4.6616% | 
| w | 0.0088% | 
| x | 0.0755% | 
| y | 1.9093% | 
| ý | 1.0721% | 
| z | 2.1987% | 
| ž | 0.9952% | 
Character frequency in Czech text (%)
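
One way the table might be used in frequency analysis is to measure how far the letter distribution of a sample text lies from the reference percentages. The following is a minimal sketch under stated assumptions, not a definitive method: the class `FrequencyDistance`, its methods `observed` and `distance`, and the choice of a sum of squared differences as the measure are all illustrative and do not come from the source.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; names and the distance measure are assumptions.
class FrequencyDistance {

    /** Percentage of each letter in the text (lower-cased); non-letters are skipped. */
    static Map<Character, Double> observed(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        int total = 0;
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (Character.isLetter(c)) {
                counts.merge(c, 1, Integer::sum);
                total++;
            }
        }
        Map<Character, Double> result = new HashMap<>();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            result.put(e.getKey(), 100.0 * e.getValue() / total);
        }
        return result;
    }

    /** Sum of squared differences over the letters present in the reference table. */
    static double distance(Map<Character, Double> observed, Map<Character, Double> reference) {
        double sum = 0;
        for (Map.Entry<Character, Double> e : reference.entrySet()) {
            double diff = observed.getOrDefault(e.getKey(), 0.0) - e.getValue();
            sum += diff * diff;
        }
        return sum;
    }
}
```

To use it, fill a `Map<Character, Double>` with rows from the table (for example `'o'` mapped to `8.6664`) and pass it as `reference`. A smaller result suggests a distribution closer to the table. Note that this sketch works on single `char` values, so the digraph `ch` is not handled.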
Bigrams
ST, PR, SK, CH, DN, TR
Trigrams
PRO, UNI, OST, STA, ANI, OVA, YCH, STI, PRI, PRE, OJE, REN, IST, STR, EHO, TER, RED, ICH
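
The source does not say how the bigram and trigram lists were obtained. As a sketch of one possible way to count such n-grams in a text, the hypothetical class `NGramCounter` below slides a window of length `n` over the upper-cased text and keeps only windows made entirely of letters. The class name, the method signature and the rule of skipping windows that contain spaces or punctuation are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not taken from the source.
class NGramCounter {

    /** Counts every run of n consecutive letters in the text, ignoring case. */
    static Map<String, Integer> count(String text, int n) {
        Map<String, Integer> counts = new TreeMap<>();
        String upper = text.toUpperCase(Locale.ROOT);
        for (int i = 0; i + n <= upper.length(); i++) {
            String gram = upper.substring(i, i + n);
            // Skip windows that span a space, digit or punctuation mark.
            if (gram.chars().allMatch(Character::isLetter)) {
                counts.merge(gram, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Calling `NGramCounter.count(text, 2)` gives bigram counts and `NGramCounter.count(text, 3)` gives trigram counts. Sorting the entries by value would then yield the most frequent ones.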
Code
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.Map;
    import java.util.TreeMap;

    // The class name is illustrative; the original shows only the method.
    public class CharFrequency {
        /**
         * Prints the frequency of each character in the file (in percent).
         * Line terminators are ignored.
         * @param source   input file
         * @param encoding character encoding of the file
         */
        public static void count(File source, String encoding) throws IOException {
            TreeMap<Character, Integer> occurrences = new TreeMap<>();
            int counter = 0; // total number of characters read
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new FileInputStream(source), encoding))) {
                String s;
                while ((s = reader.readLine()) != null) {
                    for (int i = 0; i < s.length(); i++) {
                        counter++;
                        // insert 1 for a new character, otherwise add 1 to its count
                        occurrences.merge(s.charAt(i), 1, Integer::sum);
                    }
                }
            }
            for (Map.Entry<Character, Integer> e : occurrences.entrySet()) {
                System.out.println(e.getKey() + ": " + (e.getValue() / (double) counter * 100));
            }
        }
    }
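
The table lists `ch` as its own row, whereas the method above counts every `char` separately, so `c` and `h` are tallied individually. Assuming `ch` should be counted as a single unit, a variant could look like the sketch below. The class `CzechLetterCounter`, the case folding and the skipping of non-letters are assumptions for illustration; the source does not say how the `ch` figure in the table was computed.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical variant; not the method used to produce the table.
class CzechLetterCounter {

    /** Counts letters in the text, treating the digraph "ch" as one unit. */
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        String lower = text.toLowerCase(Locale.ROOT);
        int i = 0;
        while (i < lower.length()) {
            char c = lower.charAt(i);
            if (c == 'c' && i + 1 < lower.length() && lower.charAt(i + 1) == 'h') {
                // Consume both characters of the digraph at once.
                counts.merge("ch", 1, Integer::sum);
                i += 2;
            } else {
                if (Character.isLetter(c)) {
                    counts.merge(String.valueOf(c), 1, Integer::sum);
                }
                i++;
            }
        }
        return counts;
    }
}
```

Dividing each count by the sum of all counts and multiplying by 100 gives percentages in the same form as the table.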
References
- KRÁLÍK, Jan. Czech Alphabet. The Czech Language [online]. 2001 [cited 2012-09-18]. Available from: http://www.czech-language.cz/alphabet/alph-prehled.html