Cordiance-experential-project

String Matching using Longest Common Subsequence with Words between two sentences

Project Description:

This project is an experiential Algorithm course project in collaboration with the Cordiance Company. The goal of this project is to get the closest possible UNSPSC code match for the Avalara Tax description. To begin this project, we were provided with two data files: UNSPSC and Avalara.

UNSPSC file contains code to be matched and it is divided into 4 levels:

Commodity level
Class level
Family level
Segment level

Avalara file contains Avalara Tax System code, its description and additional information related to the same.

The goal of this project is to get the closest possible UNSPSC code match for the Avalara Tax description.

Algorithm:

Documentation:

https://github.com/kasivisu4/cordiance-experential-project/blob/main/Group3-Final%20Report.pdf

Code Details:

The program to the project is written in Python language with help of Jupyter lab. All the cells are in sequence order from top to bottom. Each step mentioned in the algorithm is the function in the code. At the end of the notebook we will get the output in final_result object.

https://github.com/kasivisu4/cordiance-experential-project/blob/main/Experential_project_code.ipynb

Test Cases:

We created a test dataset of 30 records by manually mapping the corresponding UNSPSC code to Avalara description. Our algorithm is divided into two phases

Phase 1(Possible Outcome): In this phase, we were able to get one possibility for each level. Out of 30 we were able to match 27 possibilities
Phase 2: In this phase, we will get the max possibility of all the levels from phase 1. Out of 30 records we were able to match 20 records which gives us the accuracy of 66.7%.

https://github.com/kasivisu4/cordiance-experential-project/blob/main/TestRecords.xlsx

Observable Custom StopWords Visualizations:

We have performed word count on the Avalara File and tried to visualize the zoomable tree map for identifying the custom stop words

https://observablehq.com/@kasivisu4/avalara-data-analysis

Contributors:

Akhila Sulgante

Kasi Viswanath

Shital Waters

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
Algorithm.png		Algorithm.png
Experential_project_code.ipynb		Experential_project_code.ipynb
Final_Experiential_project_group3.pptx		Final_Experiential_project_group3.pptx
Group3-Final Report.pdf		Group3-Final Report.pdf
LICENSE		LICENSE
README.md		README.md
TestRecords.xlsx		TestRecords.xlsx
best_result.json		best_result.json
final_result.json		final_result.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Cordiance-experential-project

Project Description:

Algorithm:

Documentation:

Code Details:

Test Cases:

Observable Custom StopWords Visualizations:

Contributors:

About

Releases

Packages

Contributors 2

Languages

License

kasivisu4/cordiance-experential-project

Folders and files

Latest commit

History

Repository files navigation

Cordiance-experential-project

Project Description:

Algorithm:

Documentation:

Code Details:

Test Cases:

Observable Custom StopWords Visualizations:

Contributors:

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages