Sudden dip in model performance! You're confused!

It's your 'fav' colleague Karen, again!
"Choice of algorithm sucked from the start!"
"That series of costly label updates? What a waste!" She rubs it in.
Wait, the last part rings a bell!
Ohh.. Karen's lost this one too! 😆

You take comfort in a 'provision' you made.

A provision that keeps track of the 'origin' of each data sample and its labels.

This technique is called 'Data Lineage'.

Why is it important?

You're 'never sure' that samples are labeled correctly.

With Data Lineage:
· you can track the source of the data.
· you can re-examine the correctness of labeled samples.

Without it:
· the old and the new samples are mixed.
· it's hard to tell which ones are causing the issue.
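
Here's a minimal sketch of what that 'provision' could look like in Python. Everything in it is an assumption for illustration: the `Sample` fields, the `vendor_a`/`vendor_b` sources, the batch names and the toy predictions are all made up, not taken from a real pipeline. The idea is only that each sample carries its origin, so errors can be broken down by where the data and labels came from.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sample_id: str
    label: int
    source: str       # hypothetical: where the raw sample came from
    label_batch: str  # hypothetical: which labeling round produced the label


def error_rate_by(samples, predictions, key):
    """Group samples by a lineage field and return the error rate per group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for s in samples:
        group = getattr(s, key)
        totals[group] += 1
        if predictions[s.sample_id] != s.label:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}


# Toy data: two older samples and two from a recent label update.
samples = [
    Sample("a1", 1, "vendor_a", "2023-01"),
    Sample("a2", 0, "vendor_a", "2023-01"),
    Sample("b1", 1, "vendor_b", "2023-06"),
    Sample("b2", 0, "vendor_b", "2023-06"),
]
# Made-up model outputs keyed by sample_id.
predictions = {"a1": 1, "a2": 0, "b1": 0, "b2": 1}

rates = error_rate_by(samples, predictions, "label_batch")
print(rates)  # {'2023-01': 0.0, '2023-06': 1.0}
```

A high error rate in one batch doesn't prove its labels are wrong. It just tells you which batch to re-examine first, instead of digging through the old and new samples all mixed together.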

A performance dip is not always the model's fault.

Data updates are tricky!

The costs you incur with data updates may far exceed the benefits (when lineage isn't preserved).
And Karen is going to have a field day!

Data Lineage will go a long way in preventing this!

Thanks for reading!

If you enjoyed this thread, consider leaving a like and following me @farazmunshi for more on ML and AI.

See you!

