# Created by .ignore support plugin (hsz.mobi)
dist/*
target/
lib_managed/
src_managed/
project/boot/
project/plugins/project/
.history
.cache
.lib/
*.class
*.log
/*.tsv
/.idea/
/src/main/resources/*
!/src/main/resources/application.conf
*.sc
/.ensime
/.ensime_cache/
README.md
## Data Pre-Processing Tasks for Knowledge Base Creation

This project contains the pre-processing tasks that must be run on input file(s) before they can be used for the respective Knowledge Base Creation (KBC).

### Technologies Used

These are all `Scala`-based scripts / programs, each representing an individual pre-processing task, built using `sbt`.

__Dependencies__

- `cats` - for typeclasses & data types
- `monix` - for observables, non-blocking `Task` and parallel processing; in other words, for all the side effects
- `pureconfig` - for typed configuration (if and when required)

__How to run__

- make the relevant changes to `application.conf` for the respective module (such as `associatekbc` or `domain`)
- run `sbt run`; it will ask you to select the `App` you want to run
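
The prompt from `sbt run` appears because the build contains more than one main class. A minimal sketch of that situation is shown below; the object names `AssociateKbcApp` and `DomainApp` are hypothetical and only mirror the module names mentioned above, so the real entry points in this repository may be named differently.

```scala
// Hypothetical entry points: the names are assumptions, not taken from this repository.
// With two objects that define `main`, `sbt run` lists both and asks which one to start.
object AssociateKbcApp {
  val module: String = "associatekbc"

  def main(args: Array[String]): Unit =
    println(s"running pre-processing for module: $module")
}

object DomainApp {
  val module: String = "domain"

  def main(args: Array[String]): Unit =
    println(s"running pre-processing for module: $module")
}
```

To skip the interactive selection, sbt's `runMain` task accepts the main class directly, for example `sbt "runMain DomainApp"` (again assuming that object name).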

### TODO

- [ ] replace the current multiple main classes with multiple `sbt` projects
- [ ] find a better way to do parallel & non-blocking IO for huge files without non-daemon threads
- [ ] TODOs from Domain
  - [ ] refactor the regex(es) and keep them in one place
  - [ ] introduce free monads for actions and make the current text-parsing implementation part of an interpreter, thereby making the whole parsing action extensible to any kind of input data
  - [ ] once a free monad structure is introduced for domain objects, create new interpreters with Akka Streams or FS2 as effects to see if they help improve performance
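
The free monad item above could look roughly like the following sketch. It hand-rolls a tiny, non-stack-safe free monad instead of using `cats.free.Free`, purely so that it runs without extra dependencies. `ParseOp`, `Normalize`, `SplitFields`, `Handler` and `PlainTextHandler` are hypothetical names invented for illustration; none of them come from this project.

```scala
object FreeSketch {
  // Interprets one instruction of the algebra F into a plain value.
  trait Handler[F[_]] { def apply[A](fa: F[A]): A }

  // Minimal free monad: a program is a description, executed only by `run`.
  // Not stack-safe; a real implementation would more likely use cats.free.Free.
  sealed trait Free[F[_], A] {
    def flatMap[B](f: A => Free[F, B]): Free[F, B] = FlatMap[F, A, B](this, f)
    def map[B](f: A => B): Free[F, B] = flatMap[B](a => Pure[F, B](f(a)))
    def run(h: Handler[F]): A
  }
  final case class Pure[F[_], A](a: A) extends Free[F, A] {
    def run(h: Handler[F]): A = a
  }
  final case class Suspend[F[_], A](fa: F[A]) extends Free[F, A] {
    def run(h: Handler[F]): A = h(fa)
  }
  final case class FlatMap[F[_], A, B](src: Free[F, A], f: A => Free[F, B]) extends Free[F, B] {
    def run(h: Handler[F]): B = f(src.run(h)).run(h)
  }

  // Hypothetical parsing algebra: each case describes an action without performing it.
  sealed trait ParseOp[A]
  final case class Normalize(raw: String) extends ParseOp[String]
  final case class SplitFields(line: String, sep: Char) extends ParseOp[List[String]]

  def lift[A](op: ParseOp[A]): Free[ParseOp, A] = Suspend[ParseOp, A](op)

  // The parsing logic is written once, independent of any interpreter.
  def parseRecord(raw: String): Free[ParseOp, List[String]] =
    for {
      line   <- lift(Normalize(raw))
      fields <- lift(SplitFields(line, '\t'))
    } yield fields

  // One possible interpreter: plain in-memory string handling.
  object PlainTextHandler extends Handler[ParseOp] {
    def apply[A](op: ParseOp[A]): A = op match {
      case Normalize(raw)         => raw.trim
      case SplitFields(line, sep) => line.split(sep).toList
    }
  }
}
```

Under these assumptions, `FreeSketch.parseRecord(line).run(FreeSketch.PlainTextHandler)` evaluates the program for one line. Supporting another input format or effect type would mean writing another `Handler[ParseOp]` while leaving `parseRecord` unchanged, which is the extensibility the TODO item is after.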

_there will be bugs_