CPT Crosswords

1. Introduction
    1.1 System Requirements  1.2 Features  1.3 Versions
2. Browse Tab
    2.1 Choose Library Dialog  2.2 New Library Dialog   2.3 Folder Filters Dialog
3. Source Tab
    3.1 Operations  3.2 Filters Dialog
4. Base Tab
    4.1 Words Mode  4.2 Clues Mode
5. Target Tab
    5.1 Operations  5.2 Diagrams Mode  5.3 Words Mode  5.4 Clues Mode
    5.5 Sudoku Mode
6. How To
    6.1 Generate Diagrams  6.2 Generate Crosswords  6.3 Make New Base Words
    6.4 Make New Base Clues 6.5 Generate Unconstrained Sudoku
    6.6 Generate Sudoku Masks 6.7 Generate Constrained Sudoku
Appendix A: File Formats
    A.1 Text Dictionary  A.2 Tags  A.3 Rebus Definitions  A.4 Text Sudoku
Appendix B: Language Support
Appendix C: CPT Wizard
1. Introduction

CPT Crosswords is a multilingual crossword compiler and sudoku generator. The program includes the modules CPT Diagrams, CPT Words, CPT Clues, CPT Sudoku, CPT Editor, some of the dictionary tools from the CPT kit, and CPT Wizard.

CPT Editor supports all steps of crossword creation, from scratch to the printout. It is described in a separate document.

CPT Diagrams is a generator that creates diagrams in several styles.

CPT Words is a generator that fills the diagrams with words.

CPT Clues is a simple clue generator that adds clues and other information to form the complete crossword.

CPT Sudoku is a generator that creates masks and puzzles.

CPT Wizard contains short tutorials and six wizards.

Before reading this document you should browse "CPT - The Primer" for the basic notions.

1.1 System Requirements

CPT Crosswords is available for MS Windows and Linux on PCs. Since it is a Java program, a JVM (Java Virtual Machine, from a Java SDK or JRE) must be pre-installed. The program is adjusted to work with Sun's Java 1.1 and above on Windows and Linux. The MS JVM (jview) can be used as well.

The installation requires 10 MB of disk space; the required RAM depends on the OS used.

1.2 Features

1.3 Versions

The version of the program documented here is 1.3.
The pictures are from the Windows version, but the text covers the Linux version as well.
The versions of the modules used by this program are as follows: CPT Editor - 1.5, CPT Diagrams - 1.4, CPT Words - 1.4, CPT Clues - 1.4, CPT Sudoku - 1.0, CPT Wizard - 1.2, and dictionary tools - 1.5.

 

2. Browse Tab

After starting the program, you will see the following:

CPT Crosswords: Browse tab

In Folder and Folder Type you set the folder in which the CPT Editor will be started from this tab. The folder types are:

For any folder type you can set one or more folders. To add or remove a path for the current folder type, use the 'Add New' and 'Remove Selected' buttons.
Note: if you remove any of the folders created during the installation, the CPT Wizard will not work properly.

If the folder type is Files or Libraries, you can use the 'Folder Filters' button to open the Folder Filters dialog, where you can set filters for the files that will be shown in CPT Editor.

Search In globally defines the library to search when you click the 'Search for same items' button in the Source/Target tab.

Save In globally defines the library in which the crossword/sudoku set is saved when you click the 'Save this set' button in the Source/Target tab.

Sets saved from the Base tab go to the Base Words/Clues folder.

Via CPT Mode you can select the current mode of the Source, Base, and Target tabs. Diagrams will force the CPT Diagrams generator, Words will force the CPT Words generator, Clues will force the CPT Clues generator, and Sudoku will force the CPT Sudoku generator.

The 'Start' button will start the CPT Editor with the currently selected folder.

The 'OK, save' button will save all current settings.

The 'Dismiss' button will stop the program.

2.1 Choose Library Dialog

This dialog will be shown when you start a search or save operation from the Source or Target tab and the global setting is 'Choose Library'.

Choose Library dialog

Note that the list of library files is filtered by size, but you should check the data formats yourself. If you click on the 'Dismiss' button, the operation will be canceled. To proceed with the operation, select a file from the list and click on the OK button.

2.2 New Library Dialog

This dialog is used to create a new library when you click on the 'Save this set' button in the Source/Target tab and the global Save In setting in the Browse tab is 'New Library'.

New Library dialog

If you click on 'Dismiss' button, the save operation will be canceled. To proceed with the save operation, click on 'OK, save' button (after setting the parameters).

Name Tab

The Dimensions fields are disabled and show the size of the current Source/Target set.

You can use the Data Format to force a new type of library (with implied conversions and possible loss of data). Via RTL Numbers you can force conversion to an RTL library (the new format is Grids+ and the input consists of non-RTL diagrams).

In the Name field you should enter the name of the file. Do not change the extension of the file - it will be set by the program depending on the data format.

Flags Tab

Use Style and Stage to set the global flags for this library. These flags are used for searching.

Encoding Tab

If the set you are saving contains text data, you should choose the proper encoding and locale (different from 'Default'). If the selected encoding is not the same as that of the input set, all data will be recoded. Note that a user-defined 8-bit converter can be set in CPT Editor.

2.3 Folder Filters Dialog

This dialog has the same layout as the New Library dialog and is not shown here.

If the folder type is Libraries, in the Dimensions fields you can enter a fixed size, or -1 to ignore the size filter. You can also set the Data Format and all other options as filters.

In the Name field you can enter a regular expression for the file name. If the folder type is Files, only this field will be enabled.

 

3. Source Tab

The Source is a temporary crossword set, which is the result from the select operation and is the input source for the Target tab 'generate' operation. The program maintains one working file for all CPT modes.

3.1 Operations

Here you can define all options and start an operation for the current Source:

CPT Crosswords: Source tab

Dimensions text fields define the selected size (columns by rows). The fields are ignored when the Source is empty and the size is defined by the input for the select operation.

Selection field shows the number of items in the current Source.

'Filters for selection' button will start the Source Selection Filters dialog (disabled for Sudoku mode).

Select From defines the input for the select operation. Note that the selection from 'Target' is CPT Mode sensitive. For example, if current CPT Mode is Diagrams, the working target file for mode Diagrams is the current Target.

Data Format defines the forced data format for the select operation. If the Source is empty, you can select one of the data formats, which could impose a conversion.

The 'Start' button will start the select operation if the Select check box is checked and/or will start the Editor if the Show check box is checked.

Use Filters, if checked, applies the defined filters to the select operation (disabled for Sudoku mode).

Append, if checked, appends the result of the select operation to the current Source.

The 'Search for same items' button will start the search operation according to the global Search In settings from the Browse tab. All items from the Source set which are found in the search set will be marked as deleted and you should start the Editor to perform the deletion or to remove the marks.

The 'Save this set' button will start the save operation according to the global Save In settings from the Browse tab.

The 'Delete this set' button will start the delete operation - the current Source will be removed. You might need to delete the current Source in order to do a proper select operation (no append, format by selection) from the Folder Browser window in CPT Editor.

3.2 Filters Dialog

When you click on 'Filters for selection' button, the Source Selection Filters dialog will appear. The filters define the diagram properties, which will be checked during the selection.

Min/Max Words define the minimum/maximum number of words the diagram should contain. A value of -1 means that the filter is ignored.

Min/Max Word Length define the minimum/maximum length of the words the diagram should contain. A value of -1 means that the filter is ignored.

Min/Max % Blacks define the minimum/maximum percentage of black cells the diagram should contain. A value of -1.0 means that the filter is ignored.

Min/Max % Unches define the minimum/maximum percentage of unchecked letters the diagram should contain. A value of -1.0 means that the filter is ignored.

Max Black Pattern defines the maximum number of sequential blacks in a row/column the diagram may contain. A value of -1 means that the filter is ignored.

Max Items defines the maximum number of items in the result of the select operation. A value of -1 means that the filter is ignored.

User Data defines the value of the user data field the diagram should contain. A value of -1 means that the filter is ignored.

Check Style forces the check of the selected style of the diagram.

Style defines the style of the diagram.

Has 4x4 White, Has 4x5 White, and Has 2x2 Black will force the checking of the corresponding rectangle of whites/blacks.

All of the remaining check boxes will force the checking of the corresponding property of the diagram: Convex, Standard Symmetry, Horizontal Symmetry, Vertical Symmetry, Barred, Unused Cells, Has Reversed, Has Data (has letters in the grid or has clues), Cells (the set is marked for the diagram generator), Marked (the set is marked by the diagram generator), User Flag (the set is marked by the user).
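The convention that a value of -1 switches a filter off can be illustrated with a short Python sketch. This is not the program's code: the grid representation ('#' for a black cell, '.' for a white one) and all function names are assumptions made for the example, and only two of the filters (% Blacks and Max Black Pattern) are shown.

```python
def percent_blacks(grid):
    """Percentage of black cells ('#') in a grid given as a list of strings."""
    cells = sum(len(row) for row in grid)
    blacks = sum(row.count("#") for row in grid)
    return 100.0 * blacks / cells


def max_black_run(grid):
    """Longest run of consecutive black cells in any row or column."""
    lines = list(grid) + ["".join(col) for col in zip(*grid)]
    best = 0
    for line in lines:
        run = 0
        for cell in line:
            run = run + 1 if cell == "#" else 0
            best = max(best, run)
    return best


def in_range(value, minimum, maximum):
    """Range test in which a negative bound (-1 or -1.0) is ignored."""
    if minimum >= 0 and value < minimum:
        return False
    if maximum >= 0 and value > maximum:
        return False
    return True


def passes_filters(grid, min_blacks=-1.0, max_blacks=-1.0, max_black_pattern=-1):
    """Apply the (hypothetical) selection filters; defaults ignore every filter."""
    if not in_range(percent_blacks(grid), min_blacks, max_blacks):
        return False
    return in_range(max_black_run(grid), -1, max_black_pattern)
```

With all defaults left at -1 every diagram passes; setting a bound to a non-negative value activates just that check.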

 

4. Base Tab

The Base is a word list (named Base Words for Words mode) or dictionary with clues (named Base Clues for Clues mode) used by the generators. Technically, it is a binary CTree file, a format supported by all CPT programs.

The Base tab contains all operations for maintaining the Base Words/Clues. These operations are creation, adding, deleting, and saving. To change the current file, you should delete it first, and then use the Browse Folder window in the Editor to select the new file from the proper Base Words/Clues folder. The tab will be disabled when the CPT Mode is Diagrams or Sudoku.

4.1 Words Mode

When in Browse tab the CPT Mode is Words, you are working with the CPT Words generator and you will see the following:

CPT Crosswords: Base tab in Words mode

The Words field shows the number of words in the current Base Words. If it is 0, the current file is deleted.

The button 'Set New Base Words Encoding' will start the Base Words Encoding dialog, where you should set the encoding and the locale when you are creating new Base Words.
You can also set the 'Rebus Definitions' check box and select a file in order to use these Base Words for generation/editing of rebus type crosswords. In this case you cannot use as a source a CTree that already has rebus definitions.

Select CTree Dictionary if checked, will force the input from the selected CTree. Via 'Set CTree File' button you should select the path of the file. 'Filters for selection' button will start the Tags As Filters dialog where you can set the filters for selection of words from the CTree.

Select Text Word List, if checked, will force the input from the selected text file containing a word list. Via the 'Set Word List File' button you should select the path of the file. The 'Text Word List Encoding' button will start the Word List Encoding dialog, where you should set the encoding and the locale of the file. The file format is simply one word per line.

When Show check box is checked and you click on 'Start' button, the Base Words dialog window containing the CTree header will be shown. If the word list is in Unicode with many characters used, it could take some time to prepare and format the complete information.

When Select check box is checked and you click on 'Start' button, the process of creating new Base Words will be started. The Messages window, showing all messages of the creation, will appear.

If Use Filters is checked, the creation process will do the selection of words according to the filters (valid when the input is CTree file). If Append is checked, the program will add the selected input to the current Base Words.

During the creation process the words are converted to 'crossword form': they are lower-cased and all non-letter characters are ignored (unless custom composition is used). Custom composition is switched on for Thai/Hindi if the target encoding is Unicode, or for any other language if you have checked 'Rebus definitions'.
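The default conversion to 'crossword form' can be approximated in a few lines of Python. This is only a sketch of the rule as stated above (lower case, non-letters dropped); the function name is hypothetical, and the custom composition used for Thai/Hindi and rebus definitions is not modelled.

```python
def to_crossword_form(entry):
    """Lower-case the entry and drop every character that is not a letter."""
    return "".join(ch for ch in entry.lower() if ch.isalpha())
```

For example, an entry such as "Jack-o'-Lantern" would be stored as "jackolantern", so hyphens, apostrophes, spaces, and digits never reach the grid.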

When you click on 'Save' button, the Save dialog will be started in the current Base Words folder. You can save the file under any name, but you should not change the default extension. It is recommended to do the save operation after any new Base Words creation.

If you click on 'Delete' button, the current file will be deleted.

4.2 Clues Mode

When in Browse tab the CPT Mode is Clues, you are working with the CPT Clues generator and you will see the following:

CPT Crosswords: Base tab in Clues mode

The Clue Words field shows the number of words in the current Base Clues. If it is 0, the current file is deleted.

The button 'Set New Base Clues Encoding' will start the Base Clues Encoding dialog, where you should set the encoding and the locale when you are creating new Base Clues.

Select CTree Dictionary if checked, will force the input from the selected CTree. Via 'Set CTree File' button you should select the path of the file. 'Filters for selection' button will start the Tags As Filters dialog where you can set the filters for selection of words and associated clues from the CTree.

Select Text Dictionary if checked, will force the input from the selected text file containing words with clues. Via 'Set Text Dictionary File' button you should select the path of the file. The 'Text Dictionary Encoding' button will start the Text Dictionary Encoding dialog where you should set the encoding and the locale of the file. The format of the file is described in the appendix.

When Show check box is checked and you click on 'Start' button, the Base Clues dialog containing the CTree header will be shown.

When Select check box is checked and you click on 'Start' button, the process of creating new Base Clues will be started. The Messages window, showing all messages of the creation, will appear.

If Use Filters is checked, the creation process will do the selection of words according to the filters (valid when the input is CTree file). If Append is checked, the program will add the selected input to the current Base Clues.

During the creation process the words are converted to 'crossword form' (lower case and all non-letter characters are ignored). The clues data is not changed (the only exception is if recoding has been selected). The tags data is set according to the Tags file of the current locale. The format of the Tags file is described in the appendix. Full details on creating a CTree with clues can be found in the documentation of the CPT Word Lists program.

When you click on 'Save' button, the Save dialog will be started in the current Base Clues folder. You can save the file under any name, but you should not change the default extension. It is recommended to do the save operation after any new Base Clues creation.

If you click on 'Delete' button, the current file will be deleted.

 

5. Target Tab

The Target is a temporary crossword set, which is the result from the generation process. The program maintains four working files: one for CPT Diagrams generator, one for CPT Words generator, one for CPT Clues generator, and one for CPT Sudoku generator.

In the Target tab you can set the options, run the generators, and perform all operations on the Target set. These operations are: creation (via the generators), browsing/editing (via the Editor), searching, saving, and deleting.

All generators are started as separate threads, so while a generator is working you can use the Editor for other tasks.

5.1 Operations

The 'Start' button will start the selected generator if Run check box is checked and/or will start the Editor if Show check box is checked.

The 'Search for same items' button will start the search operation according to the global Search In settings from the Browse tab. All items from the Target set, which are found in the search set will be marked as deleted, and you should start the Editor to perform the deletion or to remove the marks.

The 'Save this set' button will start the save operation according to the global Save In settings from the Browse tab.

The 'Delete this set' button will start the delete operation - the current Target will be removed.

5.2 Diagrams Mode

When in Browse tab the CPT Mode is Diagrams, you are working with the CPT Diagrams generator, and you will see the following:

CPT Crosswords: Target tab in Diagrams mode

The program uses the classical approach of 'generate and test'. The 'generate' part supplies a B&W diagram and the 'test' part checks the conditions imposed by the user. If the diagram matches the requirements, it is written into the Target file. The variations are in the 'generate' part. The radio buttons No Source, Catenae, and Transform can be used to select one of the generation algorithm modes.

No Source. In this mode the program generates combinations of blacks and whites depending on the algorithm number selected (for Algorithm 1 - all possible variants, see below). For diagrams of normal size this is the preferred generation mode.

Catenae. This is the engine for big diagrams. It takes diagrams from the Source file and assembles them into bigger ones. The process is a sort of catenation of building blocks, hence the name. The generator produces B&W diagrams and takes as input only the 'Diagrams' data format. Usually, the preferred input size is half the target one. For example, to generate 8x8, the optimum input size is 4x4, but you can use other sizes as well. In the case of NY Times style with wrapper and Standard symmetry used for generation, the preferred input size is half of the middle part. For example, for target size 15x15 half of the middle part is 9x5, and this should be the input size.

Transform. The source diagrams are converted to target ones according to the specified transformation operations (flipping, rotation, and shifting). The output diagrams have the same size as the input ones, except when the operation is a rotation and the sizes allow it.
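The 'generate and test' idea can be sketched in Python for the simplest case, an exhaustive generator that tries every black/white combination. This is an illustration under assumptions, not the program's implementation: the grid encoding ('#' black, '.' white), the function names, and the sample test condition are all made up for the example.

```python
from itertools import product


def generate_all(cols, rows):
    """'Generate' part: yield every black/white pattern of the given size."""
    for bits in product(".#", repeat=cols * rows):
        yield ["".join(bits[r * cols:(r + 1) * cols]) for r in range(rows)]


def generate_and_test(cols, rows, test):
    """Keep only the diagrams accepted by the user-supplied 'test' part."""
    return [grid for grid in generate_all(cols, rows) if test(grid)]


def at_most_one_black(grid):
    """Sample test condition: the diagram may contain at most one black cell."""
    return sum(row.count("#") for row in grid) <= 1
```

A call such as generate_and_test(3, 3, at_most_one_black) walks through all 512 patterns of a 3x3 grid. The number of patterns doubles with every added cell, which is why exhaustive generation is practical only for small sizes and why the other algorithm variants restrict the combinations.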

In the Dimensions text fields you should set the desired size of the diagrams. The Result field shows the number of items in the current Target. When the check box Run is set and you press the 'Start' button, the generation will start. When Show is set, but not Run, and you press the 'Start' button, the Editor will be started with the current Target set. During the generation 'Pause' and 'Stop' buttons will appear on the button bar. Use them to pause or stop the generation process.

With Auto Source combo box you can ask the program to make the selection of the Source for you. The steps of this operation are as follows: according to the selected Target dimensions the software will look for libraries having proper dimensions as building blocks. If nothing is found it will show a message like "The proper 8x4 not found" and will stop. If something is found, the generation will start.

Generation Filters Dialog

'Filters for generation' button will start a dialog for the filters. The filters are the conditions you supply for the 'test' part of the generation. The dialog is almost the same as the one for the Source selection filters, so only the differences are described here. In several check boxes the word "No" replaces "Has", and some check boxes are replaced with others supported only by the diagram generator.
The check box All Blacks Assigned is valid if the selected style is Scandy or Clues In. It means that the generator will not allow standalone black boxes that have no clue assigned. 'Standalone' black boxes are those without a clue that have no adjacent black box with a clue assigned. If a black box without a clue has a horizontal or vertical neighbour with a clue, the space of the former can, in general, be used for the clue of the latter; that is why the former is not 'standalone' and will be allowed.
The check box Mark Used, when set, will force the generator to mark the used diagrams in the Source file. If the check box Cells is set, the generator will switch to the mode for generating building blocks. In this mode some restrictions are relaxed (e.g. the min length of words on the boundaries and the Convex property are ignored) and the Target header flag Cells is set. Black Grid is added to the Style list; it is supported only for square diagrams and only in No Source mode.
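The 'standalone' rule of All Blacks Assigned can be expressed as a small Python check. The sketch assumes that black boxes are given as a set of (column, row) pairs and that the boxes holding a clue form a subset of them; both the representation and the function name are hypothetical.

```python
def standalone_blacks(blacks, with_clue):
    """Return the black boxes that have no clue and no horizontal or
    vertical neighbour holding a clue (sorted for a stable result)."""
    result = []
    for (col, row) in sorted(blacks - with_clue):
        neighbours = {(col - 1, row), (col + 1, row),
                      (col, row - 1), (col, row + 1)}
        if not (neighbours & with_clue):
            result.append((col, row))
    return result
```

Under this reading, a diagram would satisfy All Blacks Assigned when the returned list is empty.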

Algorithm Options Dialog

The 'Algorithm options' button will start the dialog where you can choose the parameters for the 'generate' part of the generation.

Algorithm Options dialog

Show Generation means to show a window displaying the generated diagrams. After finishing the generation, you have to close this window because the program will not do this.

Sort Patterns is used to obtain different diagrams in No Source algorithm mode. If not checked, the patterns are used in ascending binary order - this is the recommended variant to start with. When checked, you should define the desired sorting. If Descending is not checked, the order is ascending. Binary means that the patterns are considered as binary numbers, which define the order. Blacks means that the number of blacks in the pattern defines the order. Random means that random numbers assigned to the patterns define the order. When Random is checked, you can set the Seed value to -1 (current time is used) or to some positive integer value. Practice shows that reordering the patterns can give results faster. For example, sizes greater than 17 can be obtained in seconds using random sorting.

The radio buttons from Algorithm 1 to Algorithm 4 are used to define how the generator should make the combinations. The first one will try all possible combinations, and as the algorithm number increases, fewer and fewer combinations are tried. Our advice is always to start with number 4 and then to try the others.
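The three sort orders can be sketched in Python as follows. The pattern encoding ('#' black read as binary 1, '.' white read as 0) and the function name are assumptions for the example; the program's internal pattern format is not documented here.

```python
import random
import time


def sort_patterns(patterns, mode="binary", descending=False, seed=-1):
    """Order row patterns by binary value, number of blacks, or random tags."""
    if mode == "binary":
        # Assumption: a black cell counts as bit 1, a white cell as bit 0.
        key = lambda p: int(p.replace("#", "1").replace(".", "0"), 2)
    elif mode == "blacks":
        key = lambda p: p.count("#")
    elif mode == "random":
        # Seed -1 takes the current time; any other value fixes the sequence.
        rng = random.Random(time.time() if seed == -1 else seed)
        tags = {p: rng.random() for p in patterns}
        key = lambda p: tags[p]
    else:
        raise ValueError("unknown sort mode: " + mode)
    return sorted(patterns, key=key, reverse=descending)
```

A fixed seed makes a random run repeatable, which is useful when a particular ordering turned out to produce results quickly.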

If No Source algorithm mode has been selected, the meanings are as follows:
Algorithm 1: unconditionally all possible variants.
Algorithm 2 - 4: mainly variants of symmetric diagrams - just the top-left quarter is generated, and this part is used as a building block that is transformed to obtain the final diagrams. You can obtain symmetric diagrams via Algorithm 1 as well, but here there are fewer variants and the generation is much faster. When Black Grid style is selected in filters, the program will slightly relax the requirements in order to obtain symmetric diagrams, but will not do this for Algorithm 1.
The style Scandy is supported in this mode as well, but the required time is reasonable only for small sizes. You could use Catenae mode for this style with a proper Source. And a reminder: the program has no built-in knowledge of all impossible variants. For example, there are no symmetric square Scandy diagrams of even sizes, but the generation will start anyway and will result in 0 after 5 seconds or after 5 hours.
When Clues In style is selected, the program uses a time limit of 10 seconds for the 'hard' clue allocation algorithm in order to prevent blocking of the generation.
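One way to build a symmetric diagram from a top-left quarter is to mirror it to the right and then downwards. The Python sketch below shows only this one possibility; which transformations Algorithms 2 - 4 actually apply is not specified here, and the function name and grid encoding are assumptions.

```python
def expand_quarter(quarter, cols, rows):
    """Mirror a top-left quarter into a full cols x rows diagram.

    The quarter must have ceil(rows / 2) rows of ceil(cols / 2) cells.
    For odd sizes the middle row/column belongs to the quarter and is
    not duplicated.
    """
    top = [row + row[::-1][cols % 2:] for row in quarter]
    return top + top[::-1][rows % 2:]
```

Because only about a quarter of the cells vary freely, the number of candidate diagrams drops sharply compared with generating the whole grid, which matches the remark that these algorithms are much faster than Algorithm 1.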

If Catenae mode is selected you can use Style Wrapper and Use StdSymm check boxes.

Style Wrapper, if checked, will force the generator to put the proper wrapper for the style selected in the filters dialog. The wrapper for Scandy style is just the top row and the left column. The wrapper for NY Times style is of size columns x 3 on top and bottom, and of size 3 x rows on left and right. In the text field Number Patterns you can enter the maximum number of patterns to be used for the NY Times wrapper. The generated wrapper variants participate as building blocks in the generation. The other styles have no wrappers.
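The preferred Catenae input size for NY Times style with wrapper and Standard symmetry follows from simple arithmetic: remove the 3-cell wrapper on each side, then take the upper half of what remains. The Python sketch below reproduces the 15x15 example from the Catenae description; the function name is hypothetical, and rounding the half up for odd heights is an assumption chosen to match that example.

```python
import math


def nyt_input_size(cols, rows, wrapper=3):
    """Preferred source size: half of the middle part inside the wrapper."""
    mid_cols = cols - 2 * wrapper
    mid_rows = rows - 2 * wrapper
    # Assumption: an odd middle height is rounded up (9 rows -> 5 rows).
    return mid_cols, math.ceil(mid_rows / 2)
```

For a 15x15 target the middle part is 9x9, so the sketch yields 9x5, the size named in the text.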

Use StdSymm if checked will instruct the generator to use standard symmetry. In this mode the upper half only of the diagram will be generated and the lower half will be obtained by the rules of the symmetry.
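Completing a diagram from its upper half can be sketched in Python as below. The sketch assumes that 'standard symmetry' means the usual 180-degree rotational symmetry of crossword grids; the function name and the grid encoding are hypothetical.

```python
def complete_standard_symmetry(upper, rows):
    """Build a full diagram of the given height from its upper half.

    upper holds ceil(rows / 2) rows. The lower half is the upper half
    rotated by 180 degrees. For an odd height the middle row is shared,
    so it must read the same in both directions for a symmetric result.
    """
    lower = [row[::-1] for row in upper[::-1]]
    return upper + lower[rows % 2:]
```

Only the upper half has to be searched, which roughly halves the number of cells the generator needs to decide.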

In Catenae and Transform modes you can choose some of the transformation operations below. They will be performed on the building blocks and this is a way to increase the possible variants for finding the desired diagrams.

The Flip operations include flipping on both diagonals and on the vertical and horizontal axes. The rotation operations include 90 degrees clockwise, 180 degrees clockwise, and 270 degrees clockwise. Shift Horizontal will do all possible shifts left and right with carry, and Shift Vertical will do all possible shifts down and up with carry. In the From Position fields you can define the starting position for the shifting operations.
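The transformation operations can be illustrated on grids stored as lists of strings. This Python sketch is an illustration only: the function names are hypothetical, and 'with carry' is read here as a cyclic shift in which cells pushed off one edge re-enter on the opposite edge.

```python
def mirror_left_right(grid):
    """Flip across the vertical axis."""
    return [row[::-1] for row in grid]


def mirror_top_bottom(grid):
    """Flip across the horizontal axis."""
    return grid[::-1]


def flip_main_diagonal(grid):
    """Flip across the top-left to bottom-right diagonal (transpose)."""
    return ["".join(col) for col in zip(*grid)]


def rotate_180(grid):
    return [row[::-1] for row in grid[::-1]]


def flip_anti_diagonal(grid):
    """Flip across the top-right to bottom-left diagonal."""
    return flip_main_diagonal(rotate_180(grid))


def rotate_90_cw(grid):
    """Rotate clockwise; a cols x rows grid becomes rows x cols."""
    return ["".join(col) for col in zip(*grid[::-1])]


def shift_horizontal(grid, n):
    """Cyclic shift of every row: positive n moves right, negative left."""
    n %= len(grid[0])
    return [row[-n:] + row[:-n] for row in grid]


def shift_vertical(grid, n):
    """Cyclic shift of the rows: positive n moves down, negative up."""
    n %= len(grid)
    return grid[-n:] + grid[:-n]
```

Applying rotate_90_cw three times gives the 270-degree rotation. Note that a 90- or 270-degree rotation swaps the dimensions of a non-square grid, which is the size exception mentioned for Transform mode.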


5.3 Words Mode

When in Browse tab the CPT Mode is Words, you are working with the CPT Words generator, and you will see the following:

CPT Crosswords: Target tab in Words mode

Dimensions fields show the current size (columns by rows).
The Target properties (dimensions, diagram style, ...) are predefined by the current Source. The locale and encoding are predefined by the Base Words.

The Result field shows the number of items in the current set.

'Generation algorithm options' button will start the Algorithm Options dialog.

Exclude Words if checked, will force the generator to skip the words from the given word list when reading the Base Words. Via 'Set Exclusion Word List File' button you should set the file path. 'Text Word List Encoding' button will start the Word List Encoding dialog where you can set the encoding and locale of the file.

Preferable Words, if checked, will force the generator to put the words from the given word list at the beginning when reading or sorting the Base Words. This way, the preferable words will always be considered first during the compilation process. Via the 'Set Preferable Word List File' button you should set the file path. The 'Text Word List Encoding' button will start the Word List Encoding dialog, where you can set the encoding and locale of the file.

To start the crossword generation, check Run and click on the 'Start' button.

During the generation, in the place of Search, Save, and Delete buttons two other buttons will appear: 'Pause' is used to pause the generation, and 'Stop' is used to cancel the generator. In pause mode the 'Start' button is used to continue the paused process.

If you have checked Show Generation in Algorithm Options dialog, a window showing current status of the grid will appear.

CPT Words generator: generated grid

This is a sample of the final form of the window of Mixed+ mode generation. The title shows that the last grid was generated in less than 1 second, and the total time for all of the 50 grids is about 1 minute. The status bar contains the properties of the current diagram and the actual number of the words used during the generation of the current grid.

When the generation process is finished, the number of generated items will be set in the Result field. If the generation fails, an error message will be shown. The message 'Out of words' means that the generator is not able to find a solution using the current Base Words and the current algorithm options.
If the Base Words file has rebus definitions, the words will be shown in encoded form in this window. You may see a warning message for deleted grids due to repeated words after the decoding of rebus markers.

Algorithm Options Dialog

CPT Words generator: Algorithm Options dialog

Algorithm Tab

Algorithm Modes:

Unicode is supported by Words and Mixed+ modes. If you have selected Default or unsupported mode, the program will choose Mixed+.

Generally, the Mixed+ mode is the fastest one, but for some particular cases this is not true. Mixed mode could be faster for simple diagrams, Words mode could be faster for Unicode, and Cells mode is the fastest for double word squares.

Via the radio buttons Algorithm 1 to Algorithm 4 you select one of the built-in algorithms of the generator. They differ in the way the search list is built and in the backtracking process. Note that these algorithm numbers are actually modifications of the selected algorithm mode. Only Algorithm 3 is 'sound' - it will find a solution if at least one exists - but in most cases it is the slowest one. If you want to find all possible solutions, use Algorithm 3, Max Variants = big number, and Max Common Words = -2 (see below). The other algorithms use 'unsound' techniques and heuristics in order to find a solution quickly. All options marked as 'unsound' are ignored when Algorithm 3 is selected.
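What a 'sound' search means can be shown with a tiny exhaustive backtracking fill in Python. This is not the generator's Algorithm 3, only a generic sketch on a fully white square grid; the function name and the prefix-based pruning are assumptions of the example.

```python
def all_fills(words, size):
    """Return every way to fill a size x size white grid so that all rows
    and all columns are words from the list (rows must be distinct)."""
    words = [w for w in words if len(w) == size]
    # Every prefix of every word; a column that is not a prefix is a dead end.
    prefixes = {w[:i] for w in words for i in range(size + 1)}
    solutions = []

    def extend(rows):
        if len(rows) == size:
            solutions.append(list(rows))
            return
        for word in words:
            if word in rows:
                continue
            candidate = rows + [word]
            columns = ["".join(r[c] for r in candidate) for c in range(size)]
            if all(col in prefixes for col in columns):
                extend(candidate)  # go deeper; returning here backtracks

    extend([])
    return solutions
```

The search abandons a branch only when some column can no longer become a word, so no valid solution is ever skipped. Heuristic ('unsound') methods give up this guarantee in exchange for speed.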


Options Tab

Keep Source means that if there are preset letters in the grid, the generator should not delete these letters during the backtracking. Usually, you should set it to unchecked when the Source is unfinished result from a previous generation, otherwise, you will see immediately the message 'Out of words' (if you have not increased the size of Base Words).

Local Backtrack is one of the 'unsound' techniques. The words chosen using this technique will be drawn in blue color.

Max Backtracks is another 'unsound' parameter. The value of -1 is the default - the generator will choose some small number according to the algorithm mode. For algorithm mode Words the default value is 5 and for other modes it is 100. The value 0 is a special case - the generator will make a jump backtrack to the starting word/cell when multiple repeated backtracks are detected. Other values define the maximum number of repeated backtracks the generator should allow. If you set a big number (e.g. more than the number of Base Words), this parameter becomes 'sound'.

Max Variants defines the number of target grids, which will be generated per source diagram. If this value is greater than 1, after finishing the current generation, the result is saved, the generator backtracks, and starts another variant using the same source.

Max Common Words defines the maximum number of common words of the variants, it is ignored if Max Variants is set to 1. When the value is negative (-1 or -2) the generator will not delete any words from the Base Words and will generate next variants using all available words. If the value is -1, it will backtrack on the starting word/cell. If the value is -2, it will backtrack on the last word/cell (this is used to find all possible variants with 'sound' search). Any other value will define the maximum common words the consecutive variants could contain and the backtracking is on the starting word. For example, the value of 0 means that all words used in the previous variants will be deleted from the Base Words. Note that this parameter affects only the variants obtained from one source diagram.

Save Unfinished will force the generator to save any current grid when it is stopped for some reason.

Min Prefer. Words when > 0 will force the generator to backtrack until that number of preferable words is used.

Add Rebus Marks is valid if the current Base Words have rebus definitions. If checked, the program will add a mark for any used rebus marker in the grid.


Sort Tab

Sort Base Words means to order the lists of word/letter candidates in a particular order. If Descending is checked, the sorting order is descending; otherwise, it is ascending. Letter Frequency means that the order is defined by the normalized letter frequency of the word or by the letter frequency in the dictionary. Note that if you want the words with higher letter frequency to be at the front, Descending should be checked as well. Binary denotes binary sorting of the words/letters. Random means that a random number generator will be used to assign a random number to each word/letter and the sorting will be according to that number. Seed is the initial value for the random generator. The value -1 means "take it from the computer clock". You can set another fixed value in order to get a fixed sequence from the random generator.
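A possible reading of the Letter Frequency order is sketched below in Python. The exact formula used by the generator is not given in this document, so the sketch makes an assumption: each letter's frequency is its share of all letters in the word list, and a word's normalized score is the average frequency of its letters. All names are hypothetical.

```python
from collections import Counter


def letter_frequencies(words):
    """Relative frequency of each letter over the whole word list."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def word_score(word, freq):
    """Average letter frequency, so long words are not favoured by length."""
    return sum(freq[ch] for ch in word) / len(word)


def sort_by_letter_frequency(words, descending=True):
    """Words made of common letters come first when descending is set."""
    freq = letter_frequencies(words)
    return sorted(words, key=lambda w: word_score(w, freq), reverse=descending)
```

Words built from common letters cross with more candidates, which is presumably why placing them first can help the fill; the sketch only shows the ordering itself.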


Start Tab

The Start Position group defines the starting word in the search list. By Algorithm leaves this task to the generator. Random forces the generator to choose the starting word randomly. Choose is the user choice - in the text fields you should set the column and row coordinates of the starting word. Across means that the coordinates are for an across word; otherwise, they are for a down word.


Show Tab

Show Generation means to show a window displaying the status of the current grid. Nonstop, if not checked, will pause the generator on every solution found, and you should click on 'Start' to continue. Words Numbers will show the word numbers in the grid. Grid Numbers will show the column and row coordinates. Upper Case will force the conversion of the shown letters to upper case. Delay will pause the generation at every step for a short time. Steps to Refresh defines the frequency of refreshing the status window.

Draw 3D defines how the grid is drawn. If it is not checked the diagram will be drawn in black and white. Grid Numbers will show the column and row coordinates.

If the generator is paused and you start this dialog, you can change some of the options, and the new settings will take effect when you continue the generation. The options that cannot be changed in pause mode will be disabled. One more option will be shown in pause mode: Cancel Current (in Options Tab) means "cancel the generation for the current source grid and continue with the next source grid".

 

5.4 Clues Mode

When in Browse tab the CPT Mode is Clues, you are working with the CPT Clues generator and you will see the following:

CPT Crosswords: Target tab in Clues mode

Dimensions fields show the current size (columns by rows).

The Result field shows the number of items in the current set.

'Filters for selection' button will start the Tags As Filters dialog where you can set the filters for selection of clues from the Base Clues.

'Generation algorithm options' button will start the Algorithm Options dialog.

More Dictionaries allows including additional CTree dictionaries in the search process. Use 'Add New' button to set the path of the file. 'Remove Selected' button will delete the selected entry. Note that the searching in additional dictionaries is quite slow if they are not in 'crossword form' (they are scanned sequentially and any word is converted to 'crossword form' before the match test.)

During the generation you can use the 'Stop' button to cancel the generator.

If you have selected Interactive mode in the Algorithm Options dialog, the Target Crossword dialog showing the selected data will appear. When the generation process is finished, the number of generated items will be set in the Result field. If the generation fails, an error message will be shown, or a warning message if some clues/answers were not found.

Algorithm Options Dialog

CPT Clues generator: Algorithm Options dialog

Mode Tab

Interactive Mode will show the Target Crossword dialog, where you can browse/edit all of the generated data.

Show Tags will add information for the tags in front of the clue/answer text. This information is not included in the final text.

Include Answers and Include Title Data tell the program to include these items in the crossword as well. The answers are the optional presentations of the words that could be shown in the printout. If the Base Clues or the additional dictionaries contain a clue with tag 'xa', it is taken as the answer; otherwise, the program converts the first letter of the word to upper case and uses the result as the answer. Take care with acronyms and multiple-word names, because this default is obviously not correct for them.
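
The default answer rule is easy to picture with a short sketch. The function name and the dictionary of explicit 'xa' answers below are hypothetical; only the rule itself (use the 'xa' entry if present, otherwise capitalize the first letter) comes from the description above.

```python
def default_answer(word, explicit_answers=None):
    """Return the 'xa' answer if one is given, else capitalize the first letter."""
    explicit_answers = explicit_answers or {}
    if word in explicit_answers:
        return explicit_answers[word]
    return word[:1].upper() + word[1:]


print(default_answer("accelerator"))             # Accelerator
print(default_answer("nato"))                    # Nato (wrong for an acronym)
print(default_answer("nato", {"nato": "NATO"}))  # NATO
```

The second call shows why acronyms and multiple-word names need an explicit 'xa' entry.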

Reject Source Data means that any clues, answers or title data from the source will be ignored. If not checked, the data from the source will be selected.

Unicode Target means to convert the crossword data to Unicode. You will need to check this, if the encoding of the source grid and the encoding of the clue dictionaries are not compatible.

The Clue Selection group shows the simple strategies used to select the clues. Use Filters means that any clue not matching the filters will be ignored. First Found will stop the search process on first found clue. Shortest will select the shortest clue from the list of clue candidates. If it is not checked, the first found will be the initial selection.

The selection itself is done in the following order: 1) the source data; 2) the Base Clues dictionary; 3) all additional dictionaries. The first letter of any clue/answer taken from the additional dictionaries is converted to upper case.
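
The selection order can be sketched as below. This is an illustration under assumptions rather than the program's logic: the function and parameter names are hypothetical, each dictionary is modelled as a plain mapping from a word to its list of clues, and when First Found is off the candidates from all sources are assumed to be pooled before Shortest is applied.

```python
def select_clue(word, source_clues, base_clues, extra_dicts,
                first_found=True, shortest=False, accept=lambda clue: True):
    """Pick a clue for `word`, searching source, Base Clues, then extra dictionaries."""
    candidates = []
    tiers = [source_clues, base_clues] + list(extra_dicts)
    for tier_index, tier in enumerate(tiers):
        for clue in tier.get(word, []):
            if not accept(clue):  # Use Filters: skip clues that do not match
                continue
            if tier_index >= 2:
                # Clues from additional dictionaries get an upper case first letter.
                clue = clue[:1].upper() + clue[1:]
            if first_found:
                return clue
            candidates.append(clue)
    if not candidates:
        return None
    return min(candidates, key=len) if shortest else candidates[0]


base = {"accelerator": ["Device for controlling speed", "Gas pedal"]}
extra = [{"accelerator": ["pedal"]}]
print(select_clue("accelerator", {}, base, extra))
print(select_clue("accelerator", {}, base, extra, first_found=False, shortest=True))
```

The first call stops at the first clue found in the Base Clues; the second pools all candidates and returns the shortest one.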

Title Data Tab

In the text fields Title, Author, and Copyright you can enter the default data to be included in any of the generated crosswords.

Target Crossword Dialog

CPT Clues generator: Target Crossword dialog

If you click on 'Dismiss' button, the generation will continue in automatic mode. While you are in interactive mode, for any crossword you have to confirm the data using 'OK, save' button, and then the program will continue with the next one. To cancel the generator, use the 'Stop' button from the main window.

The words from the grid are shown on the left and the available data for the selected word will be shown on the right. When you change the selected word, the data on the right will be changed as well.

To change the initially selected clue, select the new clue from the list and click on Fix Clue. If you want to change the text of a clue, edit it in the text clue field and click on Fix Clue. The program will take the data from the text field only when the state of Fix Clue is changed from unchecked to checked.

Proceed the same way with the answer.

 

5.5 Sudoku Mode

When in Browse tab the CPT Mode is Sudoku, you are working with the CPT Sudoku generator and you will see the following:

CPT Crosswords: Target tab in Sudoku mode

Dimensions fields show the current sudoku size (columns by rows).

The Result field shows the number of items in the current set.

'Algorithm options' button will start the Algorithm Options dialog.

No Source if checked will force the program to generate the mask itself and you should give the desired sudoku size in Dimensions. If not checked, the masks (if needed) will be taken from the Source.

Algorithm Options Dialog

CPT Sudoku generator: Algorithm Options dialog

The Sudoku generator supports constrained mode (using mask of the givens) and unconstrained mode (without predefined mask).

Givens define the number of givens (clues) of the puzzle. Ignored in unconstrained mode and when a Source mask is used.

The Symmetry group can be used when the program has to generate masks. Standard means that the lower half is the same as the upper half but rotated 180 degrees clockwise. Diagonals is symmetry on both diagonals. Horizontal - the lower half is a reflection image of the upper half. Vertical - the right half is a reflection image of the left one.
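
The four symmetry types can be expressed as cell-to-cell mappings. The sketch below checks whether a square mask has a given symmetry; the function name and the representation of a mask as a list of strings are assumptions made for the illustration.

```python
def has_symmetry(mask, kind):
    """Check a square mask (list of equal-length strings) for one symmetry type."""
    n = len(mask)
    last = n - 1
    partners = {
        "standard":   lambda r, c: [(last - r, last - c)],        # 180 degree rotation
        "horizontal": lambda r, c: [(last - r, c)],               # lower half mirrors upper
        "vertical":   lambda r, c: [(r, last - c)],               # right half mirrors left
        "diagonals":  lambda r, c: [(c, r), (last - c, last - r)],  # both diagonals
    }[kind]
    return all(
        mask[r][c] == mask[pr][pc]
        for r in range(n)
        for c in range(n)
        for pr, pc in partners(r, c)
    )


mask = ["100",
        "000",
        "001"]
print(has_symmetry(mask, "standard"))    # True
print(has_symmetry(mask, "horizontal"))  # False
```

The same mappings could be used the other way round, to copy a given from one cell to its partner cells while building a mask.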

The Sort Patterns group is used for different ordering of the mask and puzzle generation. If it is not checked, the order of patterns will always be the same and the generation will produce the same results. If Descending is checked, the sorting order is descending; otherwise, it is ascending. Binary denotes binary sorting of the patterns. Random means that a random number generator assigns a random number to every pattern and the sorting follows that number. Seed is the initial value for the random generator. The value -1 means "take it from the computer clock". You can set another fixed value in order to get a fixed sequence from the random generator.

Show Generation means to show a window displaying the generated puzzles/masks. After finishing the generation, you have to close this window because the program will not do this. Steps to Refresh defines the frequency of refreshing the status window.

Via the Algorithm Mode group you should select the mode of generation.
Unconstrained forces the mode of generation without predefined masks. This is the fastest mode, but it produces low quality results.
Fixed Masks and Fixed Masks 3 are algorithms which use predefined masks. If 'No Source' is checked in Target Tab, the program will generate masks and then will start the sudoku generation. Note that in this case, if 'Max Variants' is set to 5, the final number of sudokus will be 25 (that is, 5 masks with 5 puzzles each).
Solve Source is just the solver module - the source should be a sudoku puzzle.

Via Difficulty you can define the desired complexity of the puzzles. 'Any' means that the generator will not check this property.

Check Details will force the program to find all properties of the puzzle.

Max Variants defines the number of puzzles per mask to be generated.

Max Backtracks if > 0 will reduce the search space of the generator.

Via Save As you can define the format of the Target library and what to generate. 'Grids' means masks plus all digits (solved sudoku), library file extension 'xlz'. 'Puzzles' - masks plus the given digits, library file extension 'glz'. 'Masks' - bit maps like diagrams but the 1's show the positions of the givens, library file extension 'dlz'. If 'Masks' is selected, the program will generate only masks, which can later be used as Source for constrained generation (this is similar to the CPT Diagrams generator).

 

6. How To

Here you will find step by step procedures for the most important tasks.

One general remark: if you start a generation and immediately see a 'Stopped' message, the selected parameters are wrong or not supported.

 

6.1 Generate Diagrams

Ensure that in Browse tab the CPT Mode is Diagrams.

Quick Start

No Source, size 15x15, NY Times style:

No Source, size 15x15, NY Times and Clues In style:

Use the same procedure as above but in filters select Clues In in Style list, set Check Style, and set 10 in Max Items (here we impose a hard requirement).

No Source, size 15x15, Black Grid style:

Use the same procedure as for the first sample but in filters set 35 in Max % Blacks, -1 in Max % Unches, 5 in Max Black Pattern, select Black Grid in Style, set Check Style and clear all No White flags.

More Examples

The best way to understand how the generator works is to start with simple examples.

'No Source' , size 5x5:

You will see the generated diagrams in a window which will appear on the right. These samples are probably not what we would like. So, close the display window, run the filters dialog, set Standard Symmetry, put 2 in Max Black Pattern, enter 25 in Max % Blacks and run the generation again. The result is noticeably better, we hope. Play with the filter settings to see the effect on the generation. Do not delete the Target - we will use it in the next example.

'Catenae', size 10x10:

You will see the catenation process in action. Now you can play with Flip, Rotate and the algorithm number. For Algorithm 1 the transformation operations will not increase the combinations but will slow down the search process. For some combinations of the flags the generator will not find any result and will briefly show a "Stopped" message.

You can select the result in the Source and repeat this procedure with bigger size (20x20).

'Catenae', size 15x15, NY Times style:

Without proper Source the result will be nothing or some small number of diagrams. To improve the output, we will use style wrapper - in Algorithm Options dialog set Style Wrapper. This could add some more diagrams to the result. To get better results, we have to select/generate carefully our building blocks. Fortunately, the Libraries folder contains the proper sets. So, we choose All Libraries in Auto Source and run the generation again. After several seconds the new diagrams will begin to appear. It is a good idea before any long generation to check carefully the filters, e.g. set Max Word Length to 12, Max % Blacks to 20, No 2x2 Black and No 3 Black Corner should be checked. The essential properties for the style are always on and checked by the program in this mode.

Here is a table for some preferred input sizes for NY Times style with wrapper and standard symmetry used for generation:

Grid Size   Input Size   Grid Size   Input Size   Grid Size   Input Size
9 x 9       3 x 2        17 x 17     11 x 6       25 x 25     19 x 10
11 x 11     5 x 3        19 x 19     13 x 7       27 x 27     21 x 11
13 x 13     7 x 4        21 x 21     15 x 8       29 x 29     23 x 12
15 x 15     9 x 5        23 x 23     17 x 9       31 x 31     25 x 13
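
The listed values follow a regular pattern: for a grid of size N x N the input width is N - 6 and the input height is half of that, rounded up. This is only an observation drawn from the table, not a rule stated by the program, and the function name below is hypothetical.

```python
def input_size(grid_size):
    """Input (columns, rows) matching the table rows for an odd N x N grid."""
    inner = grid_size - 6
    return inner, (inner + 1) // 2


for n in range(9, 32, 2):
    cols, rows = input_size(n)
    print(f"{n} x {n}: {cols} x {rows}")
```

The loop prints the twelve rows of the table; sizes outside the listed range are not covered by the source and should be treated with caution.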

Catenae, size 17x17, Scandy style:

You can manually select the same cells (d16x16cells.dlz file) into the Source, set None in Auto Source and try other Target dimensions like 13x13, 15x15, and 15x21.

 

6.2 Generate Crosswords

Step 1. Select the Source for CPT Words generator

The Source type could be an empty diagram or a partially filled grid. The selection is usually made from the Files folder (as shown in the sample below) or from the Libraries folder. The number of diagrams in the Source set is not limited.

In Browse tab select Folder Type 'Files', CPT Mode 'Words' and click on 'Start' button.

In CPT Editor's Folder Browser window select the file '77_a.ini', click on 'Select' button, click on 'OK, save' button, and click on 'Dismiss' button. Note that the old Source should be deleted in order to do the proper select operation.

Step 2. Run CPT Words generator

Select Target tab in the main window and click on 'Start' button (Run should be checked, and the proper options should be set in Algorithm Options dialog).

After the end of the generation close the grid status window (if shown), and optionally, you can save the new grids in a library file.

Step 3. Select the Source for CPT Clues generator

The Source type could be a filled grid or a crossword with clues. In the sample below, the selection is made directly from the last generated grids. In general, you can use CPT Editor for the selection (as described in Step 1).

In Source tab choose Target in Select From, check the Select check box and click on 'Start' button.

Step 4. Run CPT Clues generator

In Browse tab switch the mode to Clues.

In Target tab check the Run check box and click on 'Start' button (the proper options should be set in Algorithm Options dialog).

After finishing the generation of clues, you can browse/print the new generated crosswords in Editor (click on 'Start' button with Show checked and Run unchecked). Note that any new generation will delete the old Target, and if you want to keep these crosswords, save them in a library file.

Generate Crosswords from CPT Editor

You can make a crossword in CPT Editor in just two steps:
- open the selected diagram (or partially filled grid) in the editor, then choose Run | CPT Words Generator and Show Target;
- when you see the generated grid(s) in the editor, choose Run | CPT Clues Generator and Show Target.
The dictionaries and the options for the generators should be set in advance.


Fight Against The Exponential Complexity

The CPT Words generator is able to compile tens of grids in seconds, but this is not the case with more complex diagrams and huge word lists, where years of CPU time might be necessary to search the space of the possible variants.

Before following the list of hints, keep these notes in mind:
When you have a small list of words as Base Words or you intend to make a long run, it is preferable to use the 'sound' approach - choose Algorithm 3 or set Max Backtracks to a big number and Local Backtrack off. You can see the difference using the file 'en134.wlb' as Base Words. This is a test word list of 134 words, which should give exactly 48 variants for a 5x5 diagram with no blacks. All algorithms in 'sound' mode with Max Common Words = -2 will find all variants, while in 'unsound' mode not all variants will be found.
A small Base Words list could result in an 'Out of words' message; increasing its size gives the program a better chance but increases the complexity as well.
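
To make the idea of a 'sound' search concrete, here is a minimal sketch of an exhaustive backtracking fill for a square diagram with no blacks. It is not the CPT Words algorithm: the function name is hypothetical, 'sound' is assumed to mean that no branch is ever abandoned early, and the same word is allowed to appear more than once in a grid, which the program may not permit.

```python
def fill_square(words, size):
    """Return every fill of a size x size grid in which all rows and columns are words."""
    words = sorted({w for w in words if len(w) == size})
    # Every prefix of every word, used to prune columns that cannot be completed.
    prefixes = {w[:k] for w in words for k in range(size + 1)}
    solutions = []

    def backtrack(rows):
        if len(rows) == size:
            solutions.append(tuple(rows))
            return
        for word in words:
            rows.append(word)
            columns_ok = all(
                "".join(row[c] for row in rows) in prefixes
                for c in range(size)
            )
            if columns_ok:
                backtrack(rows)
            rows.pop()  # undo the choice and try the next word

    backtrack([])
    return solutions


print(fill_square(["ab", "ba"], 2))  # [('ab', 'ba'), ('ba', 'ab')]
```

Because every candidate is tried at every level, the search finds all variants, but its running time grows exponentially with the grid size and the length of the word list. An 'unsound' search trades this completeness for speed by limiting the backtracking.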

Hint 1. Start with fast tests

Choose algorithm mode Default and Algorithm 1 or 2, Max Backtracks = -1 (or other small value), Local Backtrack on, sort Base Words with Descending and Cross Counters or Letter Frequency on. If there is no solution in several seconds, stop it, choose other algorithm or increase Max Backtracks and start again.

This is the default mode, which can give results in short time even for complex diagrams and big Base Words.

Hint 2. Reorder Base Words

Choose Sort Base Words with Random on and Seed = -1, start the generator and if there is no solution (or reasonable progress) in 10 seconds, stop it, and start again.

It is funny that we can fight against the complexity using a random number generator, but experience shows that this often gives results. Many complex diagrams were solved this way (using hint 1 with random sorting, or Algorithm 1 in 'sound' mode - Max Backtracks is a big number, and Local Backtrack is off).

Hint 3. Help the generator in run time

If the program is not able to get out of a 'cycle' for a long time, pause it, set Algorithm 1, Max Backtracks to 0 or 2 and run it again. When the program has overcome the 'hard place' in the grid, pause it, set Max Backtracks to a big number or switch to Algorithm 3 - this will help it to keep the words found.

Hint 4. Help the generator with the grid

Fill in by hand the longest words and some areas in the grid, choose the starting word.

This might sound like "don't use the program at all", but actually, the high quality crosswords are created this way - you have to choose the interesting theme words by yourself and leave the details to the compiler (this is true for the clues as well).

 

6.3 Make New Base Words

Note: this procedure will destroy your current Base Words (see the notes after Step 3).

Ensure that in Browse tab the CPT Mode is Words, and return to Base tab.

Step 1. Set encoding and locale of Base Words

Click on 'Set New Base Words Encoding' button and set the encoding and the locale in the dialog window.
If you want to use rebus definitions, set the check-box and optionally, select a file.

Step 2. Select input for Base Words

The input could be from a CTree dictionary (a good candidate is your Base Clues dictionary, if it contains enough words) or from a text word list. The format of the text word list is just one word per line. You can use one of the spell checking lists available on the Internet, or create your own.

Check one of the 'Select ...' radio buttons, set the file path and the filters for CTree or the encoding for the text word list.

Step 3. Start the creation

If you have set some filters, Use Filters should be checked. If you want to append the input to the current Base Words, Append should be checked.

Check the Select check box, and click on 'Start' button.

When the creation finishes, you should close the Messages window. It is a good practice to save the new Base Words in the Base Words folder after any creation (click on 'Save' button).

 

6.4 Make New Base Clues

Note: this procedure will destroy your current Base Clues (see the notes after Step 3).

Ensure that in Browse tab the CPT Mode is Clues, and return to Base tab.

Step 1. Set encoding and locale of Base Clues

Click on 'Set New Base Clues Encoding' button and set the encoding and the locale in the dialog window.

Step 2. Select input for Base Clues

The input could be from a CTree dictionary (not recommended if you want to maintain the tags) or from a text file in 'Text Dictionary' format. The text format is described in the appendix. It is quite a tedious task to create this file, but the alternative is to type the same or similar clues again and again for every new crossword.
The creation process requires a CPT Tags file as well. This file is expected to be in the 'locale' directory and to be named 'your_locale.tag', where 'your_locale' is the ISO locale code. If there is no such file, the 'default.tag' file will be taken.

Check one of the 'Select ...' radio buttons, set the file path and the filters for the CTree or the file path and the encoding for the text file.

Step 3. Start the creation

If you have set some filters, Use Filters should be checked. If you want to append the input to the current Base Clues, Append should be checked.

Check the Select check box, and click on 'Start' button.

When the creation finishes, you should close the Messages window. It is a good practice to save the new Base Clues in the Base Clues folder after any creation (click on 'Save' button).

 

6.5 Generate Unconstrained Sudoku

Ensure that in Browse tab the CPT Mode is Sudoku.

You will see the generated puzzles in a window which will appear on the right. After finishing the generation, close this window and optionally save the result (click on 'Save this set' button). You can browse and print the results: check 'Show' ('Run' unchecked) and click on 'Start' button.

 

6.6 Generate Sudoku Masks

Ensure that in Browse tab the CPT Mode is Sudoku.

In order to use the generated masks for constrained generation (see below) go to Source tab and delete the current set. In Select From choose 'Target', in Data Format choose 'By Selection'. Show unchecked, Select checked, Append unchecked. Click on 'Start' button to copy the Target set into the Source.   

 

6.7 Generate Constrained Sudoku

Ensure that in Browse tab the CPT Mode is Sudoku.

Using Source Masks

You could do the same from CPT Editor. The selected mask should be opened, choose Run | CPT Sudoku Generator and Show Target. The options for the generators should be set in advance.

Generating Masks and Sudoku in One Step

  

 

Appendix A: File Formats

A.1 Text Dictionary

The text dictionary format should have the following strict field order (even for RTL texts stored in visual order):
word | morpho-tags | user-tags | topic-tags | clue-tags clue
For the creation of CTree with clues you can have more than one line per word (all these lines should start with the same word). Here is an example of a word entry for Base Clues:
accelerator|0|0|0|xc Device for controlling speed
accelerator|0|0|0|xa Accelerator
In this sample the answer (the second 'clue', having the 'xa' tag) is actually not needed because it will be obtained by the default rules of the program. But if the word is an all-uppercase acronym or a multiple-word name, the answer should be given. If there are several clues and answers per word, each answer should follow the corresponding clue.
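
A reader for this format could look like the sketch below. It is only an illustration: the function names are hypothetical, the tag fields are kept as raw strings, and the clue tag is assumed to be separated from the clue text by the first space.

```python
from collections import defaultdict


def parse_entry(line):
    """Split one 'word|morpho|user|topic|clue-tag clue' line into its fields."""
    word, morpho, user, topic, rest = line.rstrip("\n").split("|", 4)
    clue_tag, _, text = rest.strip().partition(" ")
    return {
        "word": word.strip(),
        "morpho": morpho.strip(),
        "user": user.strip(),
        "topic": topic.strip(),
        "clue_tag": clue_tag,
        "text": text.strip(),
    }


def group_entries(lines):
    """Collect the (clue tag, text) pairs of each word, keeping the line order."""
    entries = defaultdict(list)
    for line in lines:
        if line.strip():
            item = parse_entry(line)
            entries[item["word"]].append((item["clue_tag"], item["text"]))
    return dict(entries)


sample = [
    "accelerator|0|0|0|xc Device for controlling speed",
    "accelerator|0|0|0|xa Accelerator",
]
print(group_entries(sample))
```

Keeping the line order matters because an answer is expected to follow the clue it belongs to.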

A.2 Tags

The format is quite complex and if you are really curious, check the documentation of CPT Word Lists. In 'locale' directory you can find some examples like 'enSample.tag'. Here is the sample of the minimal Tags file you will need ('default.tag' - optional tags are commented out).
Cp1252
# Morphology Tags
#<morpho
# ...
#>

# User Tags
#<user
# ...
#>

# Topics Tags
#<topic
# ...
#>

# Clues Tags
<clue
0  # 0 code unused
xc  clue
xa  answer
>

# END of all tags

A.3 Rebus Definitions

This file is used for creation of Base Words file having rebus definitions, which is used for generation/editing of rebus type crosswords. The first line should contain the encoding. The rest of the lines should contain definitions of the form marker-character:marker-string. It is similar to that in CPT Editor's Additional Properties dialog, Markers Tab. As 'marker-character' you should define a character which is not a letter used in the alphabet of the source word list. The following characters also can not be used as marker-character: ':', '*','=', and '.'.
The 'marker-string' is one or more letters of the alphabet, which will be represented internally by the marker-character. Here is a sample extracted from the file "en.reb":
Cp1252
0:up
1:one
2:two
{:ra
[:ro
The program expects the files with source rebus definitions to be in the subdirectory 'locale' and to have names like 'en.reb', 'de.reb', 'ru.reb', and so on. You can also set a different location and file name in the dialog 'Set New Base Words Encoding'.
The Base Words file having rebus definitions will have an attached file '*.rdc' with the compiled definitions.
The CPT Words generator will add 'marker definitions' only for the used rebus definitions in any generated crossword. Optionally, the program can add marks for the used markers.
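
The rules for a rebus definitions file can be summarized in a small validating reader. This sketch is not the program's parser: the function name is hypothetical, the file content is passed in as already decoded text, and the alphabet of the source word list is supplied by the caller.

```python
FORBIDDEN_MARKERS = {":", "*", "=", "."}


def parse_rebus(text, alphabet):
    """Return (encoding, {marker-character: marker-string}) from rebus definitions."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    encoding = lines[0].strip()  # the first line holds the encoding name
    definitions = {}
    for line in lines[1:]:
        marker, sep, letters = line.partition(":")
        letters = letters.strip()
        if sep != ":" or len(marker) != 1 or not letters:
            raise ValueError("bad definition: " + repr(line))
        if marker in FORBIDDEN_MARKERS or marker in alphabet:
            raise ValueError("marker not allowed: " + repr(marker))
        definitions[marker] = letters
    return encoding, definitions


sample = "Cp1252\n0:up\n1:one\n2:two\n{:ra\n[:ro\n"
print(parse_rebus(sample, "abcdefghijklmnopqrstuvwxyz"))
```

A definition such as 'a:bc' is rejected for an English word list because 'a' is a letter of the alphabet.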

A.4 Text Sudoku

When in CPT Editor you select File | Save As File | TXT Sudoku the program will create a file of the form:
98..4....
...6...2.
1.......5
....15...
......83.
4........
.2.3.9...
......1..
...8.....
where '.' is used for an empty cell.

The external sudoku text library format is of the form:

.3.978....9....1.7.6...49..9...41..54.6.9.2.85..86...9..17...9.3.7....4....452.3.
.5.....7.67.243...18.....4...8.5.1..9.54.72.3..7.9.8...9.....35...876.29.4.....1.
Every puzzle takes one line. As empty cell the following characters are accepted: '.' , '0', '-' or '*'. In order to import a file into a CPT library, open the Editor in a library folder, and select New Library. Then in the Name tab of the New Library dialog you should check Import Sudoku Puzzles and give the input file via 'Set Text File' button.
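
The relation between the two text forms can be shown with a short conversion sketch. The function name is hypothetical and a 9x9 puzzle is assumed; the accepted empty-cell characters are the ones listed above.

```python
EMPTY_CHARS = ".0-*"


def parse_library_line(line, size=9):
    """Turn a one-line puzzle into `size` rows that use '.' for empty cells."""
    cells = line.strip()
    if len(cells) != size * size:
        raise ValueError("expected %d cells, got %d" % (size * size, len(cells)))
    normalized = "".join("." if ch in EMPTY_CHARS else ch for ch in cells)
    return [normalized[r * size:(r + 1) * size] for r in range(size)]


line = ".3.978....9....1.7.6...49..9...41..54.6.9.2.85..86...9..17...9.3.7....4....452.3."
print("\n".join(parse_library_line(line)))
```

The printed rows have the same layout as the file written by File | Save As File | TXT Sudoku.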

Appendix B: Language Support

Letter Case

The default letter case of the words is lower. The dictionaries are created in lower case only, while the CPT Editor supports upper case as well. The special casing (Greek, German, ...), which maps one lower case character to 2 or 3 upper case characters, is supported in the display. For example, you can use the small German letter eszett (ß) in the crossword grid, and if Upper Case is checked, 'SS' will be shown in the letter cell. The reverse is not true and that's why the default letter case is lower.
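
The asymmetry of special casing is easy to see in any Unicode-aware language. The sketch below uses Python's built-in string methods; the function name is hypothetical and only mimics what the Upper Case option does in the display.

```python
def shown_in_cell(letter, upper_case):
    """Text drawn in a letter cell; str.upper() applies Unicode special casing."""
    return letter.upper() if upper_case else letter


print(shown_in_cell("ß", upper_case=True))  # SS
print("SS".lower())                         # ss, not ß
```

One lower case letter expands to two upper case letters, but lowering 'SS' cannot tell whether the original was 'ß' or 'ss', so storing the words in lower case loses no information.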

RTL Scripts

In the dictionaries and in the crosswords the RTL text should be in logical order.
The CPT Editor and the generators (Words, Clues) support 'right-to-left diagrams' as well. You can convert a diagram to RTL in Additional Properties dialog in CPT Editor via checking RTL Numbers. Note that these diagrams have data format 'Grids+'. CPT Diagrams generator does not produce directly RTL diagrams but you can save the Target as new RTL library via checking RTL Numbers in New Library dialog.

The text field controls support bidi processing without jumping selection for all Java versions when the proper RTL check box is set. More details you can find in "Language support" appendix in the documentation of CPT Word Lists.
We have to note that the dialog windows are LTR oriented even when they contain bidi enabled controls, and the meaning of the keyboard keys is always LTR as well.

Encoding

If the encoding you are using is not in the display list, use the User Encoding dialog in CPT Editor to try to include it as a user defined 8-bit converter. For example, the encoding ISO8859-13 (used by the Baltic languages) is not in the list but it is supported by the recent Sun's Java RTE (although I would suggest recoding your source data to another encoding). If there is a 'single character per letter cell' problem, it could be solved using custom converters (like the VN1 converter for Vietnamese).

Thai and Hindi

When you are creating new Base Words/Clues in Thai or Hindi languages, if you select Unicode as encoding, the custom Thai/Hindi Unicode normalization will be switched on. The default mode is 'Single Cells' (more details you can find in the documentation of CPT Word Lists). The mode is controlled by the lines "ThaiFull=0" and "HindiFull=0" in the properties file cpt_xw13.pr. If you want to switch to full syllable composition, use "ThaiFull=1" and "HindiFull=1".
If a crossword is in Thai/Hindi and in Unicode, the program will assume the custom normalization. For this reason the data recoding to/from Unicode is disabled. Note also that in CPT Editor you could properly enter 'letters'/words in a grid only via 'Search Base/Preferable Words' and 'Paste Horizontal/Vertical Word' modes.

If you want to work without the custom normalization, you should use a one-byte encoding. The one-byte encoding for Hindi is ISCII91, and since it is not in the built-in list, you have to set it as a user 8-bit converter via CPT Editor's menu File | Set Default User Encoding. Of course, using an 8-bit encoding for these languages is not recommended because the custom normalizations were created especially to solve the 'single character per letter cell' problem.


Appendix C: CPT Wizard

By default, when the program starts, this window will be shown as well:

CPT Wizard window

CPT Wizard module contains short tutorials and wizards for generating diagrams, grids, clues, crosswords from scratch, unconstrained sudokus, and constrained sudokus.

This module works with Java v. 1.2 and above. For the proper work of the wizards you should not delete sub-directories and files created during the installation.

The default options of all wizards are designed in a way that allows you to see the results in seconds. In the beginning, until you understand how the program works, it is better to select the Default radio button. When you start using the custom options, keep in mind that all tasks (except the generation of clues) are of exponential complexity and you might not see the results for hours or days.

If you click on 'Open at startup' or 'Don't show again' check boxes, to keep these settings, click on 'OK, save' button on the main window.

Generate crosswords from scratch
This wizard is mainly a demonstration of the program's features and there are very few options. It creates complete crosswords using diagrams, words, and clues generators.
By default, a small list of words is used which is extracted from the supplied Base Clues dictionary (to be sure that all words have clues) and for sizes bigger than 15x15 the grid might not be filled in with words. If you want to use your current Base Words, set 'Small word list' check box to off (could result in empty clues in final crosswords.) 'Rebus Words' means to use a variant of the small word list built using rebus definitions and to create rebus type crosswords.
Since the diagram generation is not completely random in all cases, you could set the check box 'Random words' in order to get different crosswords in sequential runs.

