Thread
Sudden dip in model performance! You're confused!

It's your 'fav' colleague Karen, again!
"Choice of algorithm sucked from the start!"
"That series of costly label updates? What a waste!" She rubs it in.
.
.
.
.
Wait, the last part rings a bell!
Ohh... Karen's lost this one too! 😆

1/5
You take comfort in a 'provision' you made.

A provision that keeps track of the 'origin' of each data sample and its label.

This technique is called 'Data Lineage'

Why is it important?

2/5
You can 'never be sure' that samples are labeled correctly.

With Data Lineage:
· you can track the source of the data.
· you can re-examine the correctness of labeled samples.

Without it:
· the old and the new samples are mixed.
· it's hard to tell which ones are causing the issue.
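
Here's a rough sketch of the idea in Python, not a definitive implementation. It assumes each sample carries two lineage fields, `source` and `label_batch`. Those names, the `Sample` class, the `error_rate_by_batch` helper and the toy data are all made up for illustration. With the fields in place, you can split errors by labeling update instead of staring at one mixed pile.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sample_id: str
    label: str
    source: str       # hypothetical lineage field: where the raw sample came from
    label_batch: str  # hypothetical lineage field: which label update produced the label


def error_rate_by_batch(samples, predictions):
    """Return the fraction of mispredicted samples for each labeling batch."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for s in samples:
        totals[s.label_batch] += 1
        if predictions[s.sample_id] != s.label:
            errors[s.label_batch] += 1
    return {batch: errors[batch] / totals[batch] for batch in totals}


# Toy data, invented for this example only.
samples = [
    Sample("a1", "cat", "source_A", "batch_1"),
    Sample("a2", "dog", "source_A", "batch_1"),
    Sample("b1", "cat", "source_B", "batch_2"),
    Sample("b2", "cat", "source_B", "batch_2"),
]
predictions = {"a1": "cat", "a2": "dog", "b1": "dog", "b2": "dog"}

rates = error_rate_by_batch(samples, predictions)
print(rates)
```

If one batch stands out, that's the label update to re-examine first.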

3/5
A performance dip is not always the model's fault.

Data updates are tricky!

The costs you incur with data updates may far exceed the benefits (when lineage isn't preserved).
And Karen is going to have a field day!

Data Lineage will go a long way in preventing this!

4/5
Thanks for reading!

If you enjoyed this thread, consider leaving a like and following me @farazmunshi for more on ML and AI.

See you!

5/5