## Data Pre-Processing Tasks for Knowledge Base Creation

This project consists of the pre-processing tasks that must be run on input file(s) before they can be used for the respective Knowledge Base Creation (KBC).

### Technologies Used

These are all `Scala`-based scripts / programs, each representing an individual pre-processing task, built using `sbt`.

__Dependencies__

- `cats` - for typeclasses & data types
- `monix` - for observables, non-blocking `Task` and parallel processing; in other words, for all the side-effects
- `pureconfig` - for typed configuration (if and when required)

__How to run__

- make the relevant changes to `application.conf` for the respective module (like `associatekbc` or `domain`)
- the `sbt run` command will ask you to select the `App` you want to run

### TODO

- [ ] replace the current multiple main classes with multiple `sbt` projects
- [ ] find a better way to do parallel & non-blocking IO for huge files without non-daemon threads
- [ ] TODOs from Domain
  - [ ] refactor the regex(es) and keep them in one place
  - [ ] introduce free monads for actions and make the current text-parsing implementation part of an interpreter, thereby making the whole parsing action extensible to any kind of input data
  - [ ] once a free monad structure is introduced for domain objects, create new interpreters with Akka Streams or FS2 as effects to see if they help improve the performance

_there will be bugs_
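
__Sketch: multiple `sbt` projects__

The first TODO item (one `sbt` project per task instead of several main classes) could look roughly like the `build.sbt` below. This is only a sketch under assumptions: the module directories `associatekbc/` and `domain/` are taken from the module names mentioned above, and the Scala and library versions are placeholders rather than the versions this repository actually uses.

```scala
// build.sbt -- hypothetical sketch; directory layout and versions are assumptions

// Settings shared by every module. The versions are placeholders:
// pin them to whatever the project really depends on.
lazy val commonSettings = Seq(
  scalaVersion := "2.13.12",
  libraryDependencies ++= Seq(
    "org.typelevel"         %% "cats-core"  % "2.9.0",  // typeclasses & data types
    "io.monix"              %% "monix"      % "3.4.1",  // Task, Observable
    "com.github.pureconfig" %% "pureconfig" % "0.17.4"  // typed configuration
  )
)

// One sub-project per pre-processing task, each with its own main class
lazy val associatekbc = (project in file("associatekbc"))
  .settings(commonSettings)

lazy val domain = (project in file("domain"))
  .settings(commonSettings)

// Root project only aggregates, so `sbt compile` builds every module
lazy val root = (project in file("."))
  .aggregate(associatekbc, domain)
```

With a layout like this, a single task can be started directly with `sbt associatekbc/run` or `sbt domain/run`, without the interactive `App` selection prompt. Each module would presumably keep its own `application.conf` under `src/main/resources`.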