Database Change to My Record Collection

Mittwoch 14 Dezember 2016

In my records table I used to have the full text of the label in each row, but I ended up with small typos in label names that ended up screwing things up. So I separated the labels out into their own table and added a foreign key to link the two tables. In my records admin instead of having a text field for the label I used a drop-down that was populated from the labels table. But this caused more problems than it solved, because it greatly complicated the code for searching records, updating records, and I had to write extra methods to add new labels to the labels table.

So I decided to get rid of the extra table and just put the full text back into the records table. I kept the drop-down menu in the admin section, but populated it with data pulled from the records table to eliminate my original problem. To my Record model I added a public static function that selects the distinct labels with the text as the key and the value to pass to the drop-down menu, and then was able to greatly simplify a lot of my code. 

I just pushed this up a few minutes ago, and so far haven't found any problems, but that doesn't mean they aren't there. If anyone finds any errors in the Record Collection page or the API please let me know.

As a side note, having PHPUnit tests for everything already set up made this a whole lot easier. I didn't have to go through and check every possible combination of things that could be searched or test adding and editing and all of that because I already had the tests written. I used to rely on a QA department to regression test everything, but PHPUnit makes all of that much easier and instead of waiting days to have QA people check everything I can do it in under a minute myself.

Etiketten: coding, music
Keine Kommentare

Backing Up Data to Amazon S3

Dienstag 13 Dezember 2016

I decided I wanted to backup my database somewhere other than my server. All of my code is in git so the only thing that could be lost in case of server errors is the database. To start I wrote a little shell script to dump the database using mysqldump. I wasn't sure where to put the SQL file to keep it off-server. My first thought was to put it in git, which was easy to do in the script. So I updated the script to add the file to git, commit the changes and push the repo up.

After a bit more thought I decided that might not be the best way to do it. It worked fine, but my usual workflow is I make all changes locally and then push it to git and then pull to production - I don't make any changes on the production server unless absolutely necessary. While adding files to git that don't exist in my dev environment shouldn't really cause any problems, I thought there must be a better way.

So I decided to put the dump file into an Amazon S3 bucket. Laravel can use S3 as a filesystem, as documented here, but I had tried to use this before and not had much luck. I saw that Amazon had a PHP package to interact with S3, which is SDKforPHP, so I thought I would try that out. After a little bit more digging I found that Amazon also has a package specifically for Laravel, which is located here. That turned out to be the winner. As opposed to trying to read pages of documentation for Laravel's file system or the Amazon SDK, all I need was a few lines of code and I was up and running. As a note, this package keeps the Amazon Keys and Secrets in the .env file, which is a lot better than keeping them in the filesystem.php config file like Laravel does. If you are going to use Laravel's S3 filesystem I suggest you update the filesystem.php file to pull them from the .env.

Now that I was able to upload files to S3 from a browser, the next step was to create an artisan command that I could add to my shell script. The Laravel documentation for this was clear and easy to follow. The only problem I had was a typo that for some reason didn't throw an error locally, but did on my production server. Other than that this is tested and working.

I had considered using S3 for this site in the past, but decided not to since I had problems with the Laravel S3 filesystem. Now that I've integrated with S3 so easily I may revisit that decision.

 

Etiketten: coding
Keine Kommentare

Redis

Sonntag 11 Dezember 2016

I've been messing around with Redis for a little while now and I'm using it in a couple places on this site. The first thing I did was I started caching some DB queries that get performed a lot, like the main blog page, the blog archives menu and the list of recent posts on the home page. Laravel's Cache facade makes caching really easy, and you can switch between the default cache driver which caches to files and Redis without changing any of the code. For some pages I use the Laravel Cache facade, for others I use the Redis facade to cache directly to Redis just for some variety. In general it is probably much better to use the Cache facade than the Redis facade because if you want to switch to a different caching mechanism with one change to the .env file instead of having to rewrite all of the code.

I also use Redis to queue some tasks which don't need to be done synchronously, but that's another post.

I never used Memcache because I didn't like the fact that it's all stored in memory, so if the server goes down you would lose all of the data in it, but Redis persists data to disk by default so it provides the speed of keeping data in memory with a very low risk of losing the data. In my case, I store the data in the DB and cache it in Redis, but if I were to start from scratch (and had plenty of RAM on my server) I would probably keep a lot more data in Redis.

So, to summarize, I think Redis is awesome and I will definitely make more use of it in my stack in the future.

Etiketten: coding
Keine Kommentare

Update on Pulling Data from Discogs

Donnerstag 01 Dezember 2016

I initially ran into problems pulling the data from discogs because I was using a package that provided a Laravel implementation of cURL called Laracurl, and it didn't provide a header that discogs needed. So I made a change to the package and got it working. After someone advised me that Guzzle was a better package than cURL I switched to that, and in the process rewrote my matching code. After running the new, improved, streamlined code I have now matched all but 250 of my records to the discogs data, and 50 of those unmatched records are white labels which may not be matchable. So I am pretty happy with that ratio and will be moving on to my next project now.

Almost all of my records now have a link to discogs and a thumbnail pulled from discogs in the record detail page. If anyone wants to see additional info on the records such as tracklisting, year of release, or anything else, that info will be available on discogs. I didn't see any need to duplicate that data locally.

The code I wrote to automatically match my records to discogs did not match everything 100% accurately, and I tried to review the matches to make sure they were accurate, but I may have missed a couple here and there. 

Thanks to Discogs.com for making a great API. 

Etiketten: coding, music
Keine Kommentare

Discogs.com is one of my favorite web sites. I discovered it 15 years ago, and have used it since to research records and music and such. I just recently discovered that they have an API, so I used it to search for each record in my collection, and if it found results, I imported the link to the discogs page as well as a link to the thumbnail images to my database.

The process was rather convoluted. To start I just did a search for the data as it was in my database - artist, title, label and catalog number. But discogs often has multiple entries for each record - maybe it was released in different countries, or re-released, or has different entries for promos, test pressings and white labels. So my starting algorithm was as follows:

  • Match the full data in my database to a search request to Discogs API.
  • If one and only one result is returned, take that one and link it.
  • If more than one result is returned, filter out the ones for promos, mispresses, white labels, etc.
  • If there is still just one result, use that one. Otherwise mark the number of results returned in the database and move onto the next record.

This matched a couple hundred out of the couple thousand records in my database. Most of my records got 0 matches to Discogs, some still had multiple matches - anywhere from 2 to 35. So I started reviewing the ones with multiples by hand, and I realized that for most of the records with under 5 matches the matches were pretty much equivalent. So for those I just took the first match and assigned it. This matched another couple hundred records.

Now I put aside the few remaining records with from 5 to 35 potential matches and focused on the thousand or so that had no matches. Reviewing some of them manually, I found that many of them were due to typos in my database. So my next step was to omit the artist field and just check the title, label and catalog number. I got another couple hundred matches using this method. Then I went on and just searched using the catalog number. This method matched about half of the remaining unmatched records - but I had to manually verify each match because some catalog numbers are not unique. 

Unfortunately I do not have a catalog number for every record in my collection, and as of now about 1/3 of the records in my database are still unmatched. For those that are matched, on the record information page you will now see a link to the discogs.com page for that record, as well as a thumbnail pulled from discogs.com if available. For anyone interested in collecting records I highly recommend discogs.com as it is by far the most comprehensive database of music releases I know of. 

Etiketten: personal, music
Keine Kommentare

Archiv