Ki Data (KD)

dleuck edited this page Mar 10, 2021 · 155 revisions

Introduction

Ki Data (KD) is a simple and concise declarative language used to describe typed values, ranges, lists, maps, trees and grids. Although XML is an excellent format for marking up documents and embedding tags in free-form text, it can be cumbersome for expressing basic data structures. KD is designed to be readable at a glance. See the FAQ for a comparison of KD to other declarative languages.

KD is ideal for uses ranging from build files and object serialization to UI layouts and automation scripts. Its modern type system, schema, support for quantities with SI units, and high precision decimal type make it well suited for apps in areas such as finance and STEM. In addition to data structures, KD provides portable higher order types for URLs, versions, dates, times, durations, quantities with units, and ranges.

KD Tag

Anatomy of a Tag

Examples

@Personal // I'm an annotation
favorite_books {
  book "The Hobbit" author="J. R. R. Tolkien" published=1937/9/21
  book "Dune" author="Frank Herbert" published=1965/8/1
}

requires {
  module http://kixi.io/packages/widgets version = 5.2-beta-2 .. 5.9
  module http://kixi.io/packages/kd version = 5.0.0 .. _
}

@UI(framework="HTML5")
@Test
browserBar layout="horizontal" margin=[horizontal=4 vertical=2] {
  button icon="arrow.left" action="browser.back"
  button icon="arrow.right" action="browser.forward"
  spacer 10
  webkit:urlField constraint="fill" action="browser.toURL"
  button "Go" style="simple"
  spacer 10
  menubutton {
    "New Tab" action="browser.newTab"
    "New Window" action="browser.newTab"
    separator
    "View Source" action="browser.viewSource"
  }
}

Whitespace and Lines

Newlines are significant in KD. Other whitespace is ignored. You can force a newline with a semicolon (;) and escape the end of a line with a backslash (\) to continue the tag definition on the next line.
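
The line rules above can be sketched in Python. This is a rough illustration only: `logical_lines` is a hypothetical helper, not part of the Ki library, and it deliberately ignores quoted strings and comments, which a real parser would have to handle.

```python
def logical_lines(source: str) -> list[str]:
    """Split KD text into logical tag lines (naive: ignores strings and comments)."""
    lines: list[str] = []
    pending = ""
    for raw in source.splitlines():
        text = raw.strip()
        if text.endswith("\\"):
            # A trailing backslash escapes the newline, so keep accumulating.
            pending += text[:-1].rstrip() + " "
            continue
        pending += text
        # A semicolon acts as a forced newline.
        for part in pending.split(";"):
            part = part.strip()
            if part:
                lines.append(part)
        pending = ""
    if pending.strip():
        lines.append(pending.strip())
    return lines


example = 'tag1; tag2 "a value"; tag3 name="foo"\nmylist "a" "b" \\\n    "c"'
for line in logical_lines(example):
    print(line)
```

Each printed line corresponds to one tag definition: the semicolons split the first physical line into three tags, and the backslash joins the last two physical lines into one.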

Ki Type System (KTS)

KD uses the portable Ki Type System, which it shares with Ki Markup and Ki Script. In the example above the value of the "published" attribute is a date object, not a string. Ki's intuitive API is available for supported platforms.

Structure of a Tag

The structure of a tag is:

  annotation(s)
  namespace:name value(s) attribute(s) {
    children
  }

A tag must have either a name or, in the case of an anonymous tag, a value. All other components are optional. If the name portion is omitted the tag name defaults to the empty string (""). Names and namespaces are identifiers.

For values and child tags, order is significant and duplicates are allowed. For attributes, order is not significant and duplicates are not allowed.

Annotations appear before or above the rest of the tag. They can be a simple name such as @Test or parameterized with values and/or attributes like a tag. Annotations provide metadata for a tag which can be used for post-processing, logging, testing, etc.

Annotated Tag Examples

@Test tag "Some data"

@Test(true)
tag "Some data"

@Test(true log="output.txt")
tag "Some data"

Tag Definition Examples

  1. A name: book
  2. A name with a value: book "Lord of the Rings"
  3. A name with a list: nums 3 5 7 11
  4. A name with attributes: pets chihuahua="small" dalmatian="hyper" mastiff="big"

A tree of tags (i.e. tags with children)

plants {
    trees {
        deciduous {
            elm
            oak
        }
    }
}

A matrix

myMatrix {
   4  2  5
   2  8  2
   4  2  1
}

A Tree of Nodes with Values and Attributes

folder "myFiles" color="yellow" protection=on {
    folder "my images" {
        file "myHouse.jpg" color=true date=2005/11/05
        file "myCar.jpg" color=false date=2002/01/05
    }
    folder "my documents" {
        document "resume.pdf"
    }
}

Children are tags and may be nested to an arbitrary depth. They are indented by convention but spaces and tabs are not significant in the KD language.

Tags can be listed separated by semicolons on the same line:

    tag1; tag2 "a value"; tag3 name="foo"

Anonymous Tags

Tags with no name are known as anonymous tags. They are automatically assigned the empty string ("") as their name.

Example: An Anonymous Tag

greetings {
   "hello" language="English"
}

# If we have a handle on the "greetings" tag we can access the
# anonymous child tag by calling
#    var greeting = greetingTag.getChild(0)

💡 Anonymous tags must have at least one value

Anonymous tags must have one or more values. They cannot contain only attributes. This design decision was taken to avoid confusion between a tag with a single value and an anonymous tag containing only one attribute.

# Not allowed: An anonymous tag with a single attribute (and no values)...
size=5

# ...because it could easily be confused with a tag having a single value
size 5
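
A minimal Python sketch of the tag structure and the anonymous-tag rule described above. The `Tag` class here is a hypothetical in-memory model written for illustration; it is not the Ki library's class and its field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Tag:
    """Hypothetical model of a KD tag (not the Ki library's API)."""
    name: str = ""                                             # "" marks an anonymous tag
    namespace: str = ""
    values: list[Any] = field(default_factory=list)            # ordered, duplicates allowed
    attributes: dict[str, Any] = field(default_factory=dict)   # keys are unique
    children: list["Tag"] = field(default_factory=list)        # ordered, duplicates allowed

    def __post_init__(self) -> None:
        # An anonymous tag cannot consist of attributes alone.
        if self.name == "" and not self.values:
            raise ValueError("an anonymous tag must have at least one value")


# greetings { "hello" language="English" }
greetings = Tag(
    "greetings",
    children=[Tag(values=["hello"], attributes={"language": "English"})],
)
print(greetings.children[0].name == "")
```

Constructing `Tag(attributes={"size": 5})` raises `ValueError` in this sketch, mirroring the "size=5" case that KD disallows.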

Identifiers

A Ki identifier starts with a Unicode letter, emoji or underscore (_) followed by zero or more Unicode letters, numbers, underscores (_), and dollar signs ($). Examples of valid identifiers are:

  • myName
  • _myName
  • 😀myName
  • myName123
  • my_name
  • my_name_
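
The identifier rule can be approximated in Python as follows. This is a sketch under stated assumptions: `is_ki_identifier` is a hypothetical helper, and "emoji" is approximated by the Unicode general category So (Symbol, other), which is broader than the set of emoji.

```python
import unicodedata


def _is_start(ch: str) -> bool:
    # Assumption: emoji are approximated by Unicode category "So".
    return ch == "_" or ch.isalpha() or unicodedata.category(ch) == "So"


def _is_part(ch: str) -> bool:
    # Letters, numbers (any Unicode N* category), underscore or dollar sign.
    return ch in ("_", "$") or ch.isalpha() or unicodedata.category(ch).startswith("N")


def is_ki_identifier(text: str) -> bool:
    """Return True if text matches the identifier rule described above."""
    return bool(text) and _is_start(text[0]) and all(_is_part(ch) for ch in text[1:])


for candidate in ["myName", "_myName", "😀myName", "my_name_", "1abc", "my-name"]:
    print(candidate, is_ki_identifier(candidate))
```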

Literal Types

KD supports 21 literal types, some of which have subtypes. They are:

| Type | Description | Examples | Notes |
|---|---|---|---|
| String | Unicode text | "hello", "ท้องฟ้า", "대양" | See String |
| Char | Unicode character | 'A', 'Ж', '桜' | |
| Int | 32b signed integer | 123, 424_235_412, 0xFF | Optional underscores for legibility |
| Long | 64b signed integer | 123L, 424_235_412L | |
| Float | 32b signed float | 123.43F, 123.43f | |
| Double | 64b signed float | 123.43, 123.43D, 123d, 5.421_523 | Optional underscores for legibility |
| Dec | 128b+ signed decimal | 123.44BD, 123.44bd | See notes below* |
| Bool | boolean | true, false | |
| URL | URL | https://www.nasa.gov, ldap://ds.example.com:389 | |
| Date | simple date | 2005/12/05, 1592/4/8 | See Date |
| LocalDateTime | local date & time | 2021/12/05@05:21:23.53 | See LocalDateTime |
| ZonedDateTime | date & time w/ zone | 2021/12/05@05:21:23.53+03 | See ZonedDateTime |
| Duration | length of time | 12:74:42, 23days:05:21:23.53, 85ms | See Duration |
| Version | Version descriptor | 4.2.5, 5.2-beta, 2.8.5-alpha-2 | See Version |
| Blob | Bytes encoded in base64 | .blob(sdf789GSfsb2+3324sf2) | See Blob |
| Quantity<U> | An amount in Units | 23cm, -15mL, 2.5m3, 2.5m³ | See Quantity |
| Range<T> | Range for comparables | 2..5, 2.0..<3.5, 2.0<.._ | See Range |
| List<T> | Ordered list | [1 2 3], [4, 5, 6], ['a' 'b' 'c'] | Commas optional; See List |
| Map<K, V> | Set of key=value pairs | [name="Rufus" color="rust"], ['a'=1, 'c'=3] | Commas optional; See Map |
| Call | Function call | rgb(120, 140, 20, alpha=.5) | Commas optional; See Call |
| nil | Absence of a value | nil, null | |

*For platforms that do not support this level of precision, decimal should resolve to the most precise decimal representation possible and produce a warning.

💡 See Ki Types for more details on the type system including super types

String

There are four types of Strings in KD. They are:

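| Name | Description | Examples | Notes |
|---|---|---|---|
| Simple | Single line with escapes | "Hello", "Line1\nLine2" | '\n' is treated as an escape sequence |
| Raw | Single line without escapes | @"\files\readme.md", @"whitespace:\t\n" | '\' and white space are kept "as is" |
| Block | Multi-line with escapes | See the Block example below | |
| BlockRaw | Multi-line with no escapes | See the BlockRaw examples below | Has an alternative backquote form |

Block example:

"""
One
Two
"""

BlockRaw example:

@"""
newline: \n
slash: \
"""

BlockRaw alternative form:

`
newline: \n
slash: \
`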

💡 Notes

  1. For compatibility with SDL and JavaScript, KD provides an alternate form of BlockRaw that uses a backquote `...` rather than @"""...""".
  2. The white space prefix of lines in Block and BlockRaw is truncated if it matches the white space before the closing quotes ("""). Example:
myTag text="""
    ABC
        def
    123
    """

The String block for "myTag" will remove four spaces from the beginning of each line and produce the string:

"""
ABC
    def
123
"""

The leading quote mark's location is disregarded, and may appear on the same line as the attribute or below. This behavior is identical to Swift's multi-line String literals.
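
The indentation rule for Block and BlockRaw strings can be sketched in Python. `strip_block_indent` is a hypothetical helper, not the Ki implementation; it assumes the body lines and the line holding the closing quotes have already been separated.

```python
def strip_block_indent(body_lines: list[str], closing_line: str) -> str:
    """Drop the closing delimiter's leading white space from each body line."""
    # The prefix is whatever white space precedes the closing quotes.
    prefix = closing_line[: len(closing_line) - len(closing_line.lstrip())]
    stripped = [
        line[len(prefix):] if line.startswith(prefix) else line
        for line in body_lines
    ]
    return "\n".join(stripped)


body = ["    ABC", "        def", "    123"]
print(strip_block_indent(body, '    """'))
```

With the "myTag" example above, the closing quotes are preceded by four spaces, so "ABC" and "123" lose all indentation and "def" keeps four spaces.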

Date and DateTime Literals

KD's date and time related types are:

  • Date A simple date with year/month/day.
  • LocalDateTime A local (relative) DateTime with year/month/day@hour:minute(:second(.fractional_second)?)?
  • ZonedDateTime A DateTime with a time zone (see below)

💡 A time-of-day that isn't attached to a date can be represented by using a Duration treated as an offset from midnight.

For LocalDateTime and ZonedDateTime single digit month of the year, day of the month, hour, second and fractional second do not require leading zero padding when parsed. Zeros will be used in the canonical representation of single digit months of the year, days of the month and hours of the day, but not seconds of the minute or fractional seconds. For example, an input of 2020/5/9@2:53:2.5 will be parsed correctly, and will output: 2020/05/09@02:53:2.5.

Fractional second accuracy goes to 9 digits (i.e. nanoseconds). You can use underscores to improve legibility. Example: 8.352_432_632 for 8.352432632 seconds.
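
A small Python sketch of the padding rules described above. `canonical_local_datetime` is a hypothetical helper, not a Ki API. It pads month, day and hour, leaves seconds and fractional seconds unpadded, and passes minutes through unchanged because the text does not state a rule for them.

```python
import re

_LOCAL_DT = re.compile(
    r"(\d{4})/(\d{1,2})/(\d{1,2})@(\d{1,2}):(\d{1,2})(?::(\d{1,2})(?:\.([\d_]+))?)?"
)


def canonical_local_datetime(text: str) -> str:
    """Rewrite a LocalDateTime literal using the padding rules described above."""
    match = _LOCAL_DT.fullmatch(text)
    if match is None:
        raise ValueError(f"not a LocalDateTime literal: {text!r}")
    year, month, day, hour, minute, second, fraction = match.groups()
    # Month, day and hour are zero padded; minutes are kept as written.
    out = f"{year}/{int(month):02d}/{int(day):02d}@{int(hour):02d}:{minute}"
    if second is not None:
        out += f":{int(second)}"      # seconds are not zero padded
        if fraction is not None:
            out += f".{fraction}"     # fractional digits are kept as written
    return out


print(canonical_local_datetime("2020/5/9@2:53:2.5"))
```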

Date

Date literals in KD are specified as year/month/day. Examples:

  • 2005/12/05
  • 2020/09/18
  • 2020/9/18

LocalDateTime

Local DateTime literals don't specify a time zone. They are to be interpreted as local (relative) time. The timezone is never set, and is to be considered relative to the time and place of the reader.

Format: yyyy/mm/dd@hh:mm(:ss(.fractional)?)?

| Example | Description |
|---|---|
| 2005/09/05@05:08:03.532 | Date and time with all components zero padded (canonical form) |
| 2020/9/9@05:08:3.532 | Date and time with zero padding omitted |

ZonedDateTime

ZonedDateTime literals specify a date, time and zone. Zones can be specified with -Z, -UTC, [+/-]offset, or -KiTZ:

Format: yyyy/mm/dd@hh:mm(:ss(.fractional)?)?(-Z | -UTC | [+/-]offset | -KiTZ)

💡 Z and UTC without an offset are equivalent to UTC-0. The canonical representation is Z.

| Example | Description |
|---|---|
| 2005/12/05@05:21:23.532-Z | UTC (no offset) |
| 2005/12/05@05:21:23.532-UTC | UTC (no offset) |
| 2005/12/05@05:21:23.532-JP/JST | Ki Time Zone (KiTZ) JP/JST: Japan Standard Time |
| 2005/12/05@05:21:23.532-US/PST | Ki Time Zone (KiTZ) US/PST: US Pacific Standard Time |
| 2005/12/05@05:21:23.532+2 | UTC+02 offset |
| 2005/12/05@05:21:23.532-2 | UTC-02 offset |
| 2005/12/5@05:21:23.532+2:30 | UTC+02:30 offset |
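
As a rough Python illustration of the zone suffixes listed above, the sketch below converts a suffix to a UTC offset in minutes. `utc_offset_minutes` is a hypothetical helper. KiTZ names such as JP/JST would need a lookup table that is not shown here, so the sketch rejects them.

```python
def utc_offset_minutes(zone: str) -> int:
    """Convert a zone suffix such as "-Z", "+2" or "+2:30" to minutes from UTC."""
    if zone in ("-Z", "-UTC"):
        return 0
    sign = {"+": 1, "-": -1}.get(zone[:1])
    if sign is None or not zone[1:2].isdigit():
        # KiTZ names (for example "-JP/JST") are outside the scope of this sketch.
        raise ValueError(f"unsupported zone suffix: {zone!r}")
    hours, _, minutes = zone[1:].partition(":")
    return sign * (int(hours) * 60 + int(minutes or 0))


for suffix in ["-Z", "-UTC", "+2", "-2", "+2:30"]:
    print(suffix, utc_offset_minutes(suffix))
```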

Duration

Ki Duration literals represent a length of time (which may be negative). Duration literals are useful for expressing the duration of an event, intervals, or chronological distances from a reference point.

Examples: Duration Literals

| Example | Description |
|---|---|
| 03:00:00 or 3h | 3 hours |
| 00:12:00 or 12min | 12 minutes |
| 00:00:42 or 42s | 42 seconds |
| 00:12:32.423 | 12 minutes, 32 seconds, 423 milliseconds |
| 00:12:32.000_002_584 | 12 minutes, 32 seconds, 2,584 nanoseconds |
| 30day:15:23:04.023 | 30 days, 15 hours, 23 mins, 4 secs, 23 milliseconds |
| -00:02:30 | 2 minutes and 30 seconds ago |
| -2day:00:04:00 | About 2 days ago |
| 15days | 15 days |
| hours 16h | 16 hours |
| 23min | 23 minutes |
| 2.5min # =150s | 2.5 minutes |
| 15s | 15 seconds |
| 10.25s | 10.25 seconds |
| 12ms | 12 milliseconds |
| 54664ns | 54,664 nanoseconds |
| 3days..5days | 3 to 5 days (inclusive) |
| hourRange 1h..<10h | Between 1 and 10 hours (exclusive-right) |

💡 Notes

  • Unless you are using a single number with units specified, hours, minutes, and seconds are required. Days and milliseconds are optional.
  • If the day component is included it must be suffixed with the unit day or days.
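
The two duration notations (compound and single-unit) can be related with a short Python sketch that converts either form to integer nanoseconds. `duration_to_ns` is a hypothetical helper written for illustration; it is not the Ki parser and accepts only the unit suffixes that appear in the examples above.

```python
import re
from decimal import Decimal

_UNIT_NS = {
    "ns": 1, "ms": 10**6, "s": 10**9, "min": 60 * 10**9,
    "h": 3600 * 10**9, "day": 86400 * 10**9, "days": 86400 * 10**9,
}


def duration_to_ns(literal: str) -> int:
    """Convert a duration literal to a signed number of nanoseconds."""
    text = literal.replace("_", "")          # underscores are only for legibility
    sign = -1 if text.startswith("-") else 1
    text = text.lstrip("-")
    single = re.fullmatch(r"(\d+(?:\.\d+)?)(days|day|min|ms|ns|h|s)", text)
    if single:
        # Single number with a unit, such as 2.5min or 85ms.
        return sign * int(Decimal(single.group(1)) * _UNIT_NS[single.group(2)])
    parts = text.split(":")
    days = 0
    if parts[0].endswith(("day", "days")):
        days = int(re.sub(r"days?$", "", parts.pop(0)))
    if len(parts) != 3:
        raise ValueError(f"hours, minutes and seconds are required: {literal!r}")
    hours, minutes, seconds = parts
    total = (days * 86400 + int(hours) * 3600 + int(minutes) * 60) * 10**9
    total += int(Decimal(seconds) * 10**9)
    return sign * total


print(duration_to_ns("2.5min") == duration_to_ns("00:02:30"))
```

Under these assumptions 2.5min and 00:02:30 denote the same length of time, and -00:02:30 is its negation.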

Version

The KD Version type uses a simple schema based on Semantic Versioning 2. All numeric components must be zero or positive integers.

Format: `major('.'minor('.'micro)?)?('-'qualifier(('-')?qualifierNumber)?)?`

Version Components

| Component | Role | Description |
|---|---|---|
| major | Major feature release, possible breaking changes | non-negative integer |
| minor | Minor feature release, no breaking changes | non-negative integer |
| micro | Bug fixes and performance enhancement | non-negative integer |
| qualifier | Text label, e.g. "alpha", "beta" or "RC" | Unicode string |
| qualifierNumber | A positive integer (requires a qualifier) | positive integer |

Comparing Versions

For the purpose of comparison and inclusion in ranges, the sort order compares numeric components and qualifier, if present, ignoring case. Versions that have qualifiers are sorted below versions that are otherwise equal without a qualifier (e.g. 5.2-alpha is lower than 5.2). Qualifiers are sorted by alphabetical ordering (case insensitive), and finally the qualifier number, if present, is compared.

Examples

  1. 5.2.7
  2. 5-beta
  3. 5.2-alpha
  4. 5.2.7-rc
  5. 5-beta-1
  6. 5-beta1
  7. 5.2-alpha-3
  8. 5.2.7-rc-5
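
The comparison rules described under "Comparing Versions" can be sketched as a Python sort key. `version_key` is a hypothetical helper, not the Ki implementation. It assumes that omitted minor and micro components, and an omitted qualifier number, compare as 0, and that qualifiers consist of letters only.

```python
import re

_VERSION = re.compile(
    r"(\d+)(?:\.(\d+)(?:\.(\d+))?)?(?:-([^\W\d_]+)(?:-?(\d+))?)?"
)


def version_key(text: str) -> tuple:
    """Build a sort key that follows the ordering rules described above."""
    match = _VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a version literal: {text!r}")
    major, minor, micro, qualifier, number = match.groups()
    return (
        int(major),
        int(minor or 0),             # assumption: missing components count as 0
        int(micro or 0),
        0 if qualifier else 1,       # qualified versions sort below unqualified ones
        (qualifier or "").lower(),   # qualifiers compare alphabetically, ignoring case
        int(number or 0),
    )


versions = ["5.2", "5.2-beta", "5.2-alpha-3", "5.1.9", "5.2-alpha"]
print(sorted(versions, key=version_key))
```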

Quantity

A Quantity is an amount of a Unit. KD supports all SI base axes and several popular derived axes:

SI Base Axes

  1. Time (via Duration)
  2. Length
  3. Mass
  4. Temperature (Celsius & Kelvin)
  5. SubstanceAmount
  6. Current
  7. Luminosity

SI Derived Axes

  1. Area
  2. Volume
  3. Speed
  4. Density

Other derived axes are likely to be added in future versions.

You specify them using standard SI symbols. There is an exception for Liter (L). Due to a conflict with the "L" suffix for long literals, liter is written with "LT" or "ℓ" (e.g. 5LT or 5ℓ).

By default Quantities use the high precision Dec type for their value. You can override this behavior by adding a :i, :L, :d or :f suffix after the unit type for Int, Long, Double or Float.

Canonical Form Examples

  1. 23cm
  2. 51.4m3 or 51.4m³
  3. 97LT or 97ℓ
  4. 1000kg or 1_000kg
  5. 142.24km:d # This forces the value to be stored as a Double rather than a Dec
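
A rough Python sketch of how the value-type suffix might be applied when reading a Quantity literal. `parse_quantity` is a hypothetical helper, not the Ki API. It uses `Decimal` as a stand-in for Dec, maps :i and :L to Python's `int` and :d and :f to `float`, and does not validate the unit symbol.

```python
import re
from decimal import Decimal

_QUANTITY = re.compile(
    r"(-?\d[\d_]*(?:\.\d[\d_]*)?)([^\W\d_]+[23²³]?)(?::([iLdf]))?"
)
# Stand-ins for the Ki numeric types; Dec is the default when no suffix is given.
_CASTS = {None: Decimal, "i": int, "L": int, "d": float, "f": float}


def parse_quantity(literal: str):
    """Split a quantity literal into (value, unit symbol)."""
    match = _QUANTITY.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a quantity literal: {literal!r}")
    number, unit, suffix = match.groups()
    return _CASTS[suffix](number.replace("_", "")), unit


for literal in ["23cm", "51.4m3", "97LT", "1_000kg", "142.24km:d"]:
    print(literal, parse_quantity(literal))
```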

Range

Ranges can be created for all comparable KTS types (numbers, chars, Strings, etc.). They can be inclusive or exclusive on either side. They can also be open on either side. Here are some examples:

| Example | Description |
|---|---|
| 1..5 | Range<Int> 1 - 5 (inclusive) |
| 5.0<..<15.0 | Range<Double> > 5, < 15 (exclusive) |
| 2<..17 | Range<Int> > 2, <= 17 (exclusive-left) |
| 6..<12 | Range<Int> >= 6, < 12 (exclusive-right) |
| 6.._ | Range<Int> >= 6 (open-right) |
| _..100 | Range<Int> <= 100 (open-left) |
| 8<.._ | Range<Int> > 8 (exclusive-left, open-right) |
| 'a'..'f' | Range<Char> >= 'a', <= 'f' (inclusive) |
| 7.2.5-beta-2.._ | Range<Version> >= 7.2.5-beta-2 (inclusive, open-right) |
| 4h..<10h | Range<Duration> >= 4 hours, < 10 hours (exclusive-right) |
| 7mm..12cm | Range<Quantity> >= 7mm, <= 12cm (inclusive) |
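
The range operators above can be sketched in Python for the integer case. `IntRange` and `parse_int_range` are hypothetical names, not Ki APIs; the sketch covers only Int bounds, with `None` standing in for an open side.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntRange:
    low: Optional[int]            # None means open on the left
    high: Optional[int]           # None means open on the right
    low_exclusive: bool = False
    high_exclusive: bool = False

    def __contains__(self, value: int) -> bool:
        if self.low is not None:
            if value < self.low or (self.low_exclusive and value == self.low):
                return False
        if self.high is not None:
            if value > self.high or (self.high_exclusive and value == self.high):
                return False
        return True


def parse_int_range(text: str) -> IntRange:
    """Parse literals such as 1..5, 2<..17, 6..<12, 6.._ and _..100."""
    match = re.fullmatch(r"(_|-?\d+)(<?)\.\.(<?)(_|-?\d+)", text)
    if match is None:
        raise ValueError(f"not an Int range literal: {text!r}")
    low, low_ex, high_ex, high = match.groups()
    return IntRange(
        None if low == "_" else int(low),
        None if high == "_" else int(high),
        low_ex == "<",
        high_ex == "<",
    )


print(2 in parse_int_range("2<..17"), 17 in parse_int_range("2<..17"))
```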

Blob

KTS Blob literals are base64 encodings of byte arrays. KD uses standard (canonical) Base64 encoding as specified in RFC 4648.

Format: .blob(encoding_chars)

Examples: Blob Literals

key .blob(sdf789GSfsb2+3324sf2) name="my key"
image .blob(
    R3df789GSfsb2edfSFSDF
    uikuikk2349GSfsb2edfS
    vFSDFR3df789GSfsb2edf
)
upload from="ikayzo.org" data=.blob(
    R3df789GSfsb2edfSFSDF
    uikuikk2349GSfsb2edfS
    vFSDFR3df789GSfsb2edf
)

💡 When Blob is output from a KD tag or via Ki.formatBlob(literal), it will chunk the data into 60 char lines terminated by newlines. This output is parsable by the KD parser and Ki.parseBlob(literal).
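
The chunking behavior can be illustrated with Python's standard `base64` module. `format_blob` and `parse_blob` below are hypothetical stand-ins written for this sketch; they are not the Ki.formatBlob and Ki.parseBlob functions, and the exact output layout is an assumption.

```python
import base64


def format_blob(data: bytes, width: int = 60) -> str:
    """Encode bytes as a .blob(...) literal split into lines of at most `width` chars."""
    encoded = base64.b64encode(data).decode("ascii")
    chunks = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return ".blob(\n" + "".join(chunk + "\n" for chunk in chunks) + ")"


def parse_blob(literal: str) -> bytes:
    """Decode a .blob(...) literal, ignoring white space inside the parentheses."""
    inner = literal.strip()[len(".blob("):-1]
    return base64.b64decode("".join(inner.split()))


payload = bytes(range(100))
literal = format_blob(payload)
print(literal)
print(parse_blob(literal) == payload)
```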

List

KD Lists are written in square brackets with space and/or comma separated values. Examples:

  • [1, 2, 3, 4] # List of Ints
  • [5 6 7 8] # Commas are optional
  • ['a' 'b' 'c'] # List of Chars
  • [[1 2] [3 4]] # List of Lists
  • friends ["Pedro", "Rika", "Naisha"] type="closest" # Tag with a List value and an attribute

Map

KD maps are written in square brackets with entry pairs separated by =. Examples:

  • [Spanish="hola", Fijian="Bula"] # Keys using naked (quoteless) strings
  • [Spanish="hola" Fijian="Bula"] # Commas are optional.
  • ['a'="Ant", 'b'="Bird"] # Keys and values can be any KD type, including Lists and Maps.

Call

A KD Call is a data representation of a function invocation. It supports positional and named parameters. Examples:

  • rgb(200, 100, 120)
  • rgb(200 100 120) # Commas are optional.
  • rgb(200, 100, 120, alpha=.5) # positional and named parameters

Grid

Grids in KD are really just a list of lists made from the values of anonymous child tags. They are an API feature rather than a language feature.

matrix {
    1  2  3
    4  5  6
    7  8  9
}

grid {
    1  3  5
    7  9  11
}

// In Kotlin you would access the matrix and grid as a list of rows:
var rows = tag.getChild("matrix").childrenValues as List<List<Int>>

💡 In the future the Ki.Core library may introduce a generic grid type for use by KD.
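
The "list of lists" view of a grid can be shown with a few lines of Python. `grid_rows` is a hypothetical helper that reads the body of a grid tag directly; it assumes one anonymous child per line and Int cells only, and it does not use the Ki library.

```python
def grid_rows(body: str) -> list[list[int]]:
    """Turn the body of a grid tag into a list of rows of Ints."""
    return [
        [int(cell) for cell in line.split()]
        for line in body.splitlines()
        if line.strip()               # skip blank lines
    ]


matrix_body = """
    1  2  3
    4  5  6
    7  8  9
"""
print(grid_rows(matrix_body))
```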

Comments

KD supports single line comments beginning with # or // and C-style multiline comments that begin with /* and end with */. Multiline comments can be nested. Examples:

myInts 1 2 /* 3 */ 4 # note: this list will contain 1, 2 and 4

tag1 "fee"
/*
tag2 "fi"
tag3 "fo"
*/
tag4 "fum" // We are done!
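
Nested multiline comments need a depth counter rather than a simple search for the first closing marker. The Python sketch below shows the idea. `strip_block_comments` is a hypothetical helper, not the Ki parser, and it ignores quoted strings and single line comments.

```python
def strip_block_comments(source: str) -> str:
    """Remove /* ... */ comments, honoring nesting (naive: ignores strings)."""
    out: list[str] = []
    depth = 0
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "/*":
            depth += 1                # entering a (possibly nested) comment
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1                # leaving the innermost comment
            i += 2
        else:
            if depth == 0:
                out.append(source[i])
            i += 1
    return "".join(out)


print(strip_block_comments("myInts 1 2 /* 3 */ 4"))
print(strip_block_comments("a /* b /* c */ d */ e"))
```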

KD Files

KD files (any file ending with the extension .kd) should always be encoded using UTF-8. The use of Unicode escaping (such as the \uxxxx format used by Java and C#) is supported but not required. Non-ASCII characters should be entered directly using a UTF-8 capable editor.

💡 ASCII is a subset of UTF-8, so plain ASCII files are already valid UTF-8 and can be used as-is if only ASCII characters are required.

Example KD File

# a tag having only a name
my_tag

# three tags acting as name value pairs
given_name "Akiko"
family_name "Johnson"
height 68
daily_reading 1h..2h # 1 to 2 hours

# a tag with a value list
person "Akiko" "Johnson" 68 kids=["Yuka" "Naoki"]

# a tag with attributes
person first_name="Akiko" last_name="Johnson" height=68

# a tag with values and attributes
person Akiko Johnson height=68 # Values are using quoteless strings

# a tag with attributes using namespaces
person name:first_name="Akiko" name:last_name="Johnson"

# a tag with values, attributes, namespaces, and children
my_namespace:person "Akiko" "Johnson" dimensions:height=68 {
    son "Nouhiro" "Johnson"
    daughter "Sabrina" "Johnson" location="Italy" {
        hobbies "swimming" "surfing"
        languages English Italian 
        books_per_week 2..4 // range from 2 to 4
        smoker false
    }
}   

# -----------------------------------------------------------------

# a log entry
#     note - this tag has two values (date_time and string) and an 
#            attribute (error)
entry 2005/11/23@10:14:23.253-GMT "Something bad happened" error=true

# a long line
mylist "something" "another" true "shoe" 2002/12/13 "rock" \
    "morestuff" "sink" "penny" 12:15:23.425
   
# anonymous tag examples

files {
    "/folder1/file.txt"
    "/file2.txt"
}
    
# To retrieve the files as a list of strings using Java
#
#     var files = tag.getChild("files").getChildrenValues();
#
# The files tag has two children, each of which is an anonymous tag (a value
# with no name). Anonymous tags are assigned the empty string ("") as their name.

Uses

Because of its terse syntax and type inference capabilities, KD is ideally suited to applications such as:

  • Configuration Files
  • Build Files
  • Property Files
  • Object Serialization
  • Rules engines
  • Automation scripts
  • UI Layouts (forms, etc.)
  • Log files (formatting and parsing)

KD was designed to be language agnostic. Currently the JVM implementation is in beta 2. It is written in Kotlin with annotations that make it easy to access in Java and other JVM languages such as Scala and JRuby. The Swift implementation is in alpha and preliminary work has begun on Python and JavaScript.

References

The following languages were researched during the development of KD for the purpose of defining base types and literal formats:

| Declarative Languages | General Purpose |
|---|---|
| XML | Swift |
| JSON | Kotlin |
| RDF | Java |
| PYX (link from Pat Niemeyer) | C# |
| YAML | Ruby |
| OGML | TypeScript |
| SDL (KD's predecessor) | Dart |
| | Go |