Custom parsers
If you need more control over the constraints the LLM follows, you can define your own parsers.
Defining Constraints
Kalosm provides a set of parsers that can be combined to define constraints. The following base parsers are available:
LiteralParser
: Matches a literal string.IntegerParser
: Matches an integer (along with parsers for each rust integer type).FloatParser
: Matches a float.StringParser
: Matches a string.SeparatorParser
: Matches any number of items separated by a separator.IndexParser
: Matches any of a set of parsers and returns the index of the matched parser.StopOn
: Matches anything until a literal.WordParser
: Matches a single word.VecParser
: Matches a vector of items.
And you can combine them using the following combinators:
then
: Matches the first parser followed by the second parser.or
: Matches the first parser or the second parser.repeat
: Matches the parser a specified number of times.
In this example, we will create a parser that completes a sentence with only valid states by combining the LiteralParser
and IndexParser
:
use kalosm::language::*; // Create a list of parser for states let states = ["Alaska", "Delaware", "Florida", "Georgia", "Hawaii"]; let states_parser = states .into_iter() .map(LiteralParser::from) .collect::<Vec<_>>(); // Create a parser that tries to match each state let states = IndexParser::new(states_parser); // match a state, followed by a comma and a space, 5 times, and a newline let _validator = states .then(LiteralParser::from(", ")) .repeat(5..=5) .then(LiteralParser::from("\n"));
If you don't care about the output of the parser, but you want the LLM to adhere to a specific structure, you can also use a RegexParser
to match a regular expression:
let regex = RegexParser::new(r#"((Alaska|Delaware|Florida|Georgia|Hawaii), ){5}\n"#).unwrap();
Generating Text
Once you have defined a parser, you can generate text that adheres to the constraints defined by the parser. You can call with_constraints
on a text stream or chat stream to force the model to adhere to the constraints defined by the parser:
let llm = Llama::phi_3().await.unwrap(); let task = llm .task("You generate realistic characters for a procedurally generated game.") .typed(); let mut stream = task("Generate a character that is a wizard"); stream.to_std_out().await.unwrap(); let character: Character = stream.await.unwrap(); println!("Result: {:?}", character);