You are viewing a single comment's thread.

view the rest of the comments →

DickHertz ago

Why not ditch sql server for something thats containerizable?

PuttItOut ago

I suppose it's a matter of scale.

DickHertz ago

I see you were looking at Postgres, did that have problems scaling?

PuttItOut ago

Voats port code fully supports pg. We are expecting issues with scaling though, but we are also going to build out middle tier queueing and batch processing layers anyways.

We will need to stress test pg and address the issues before flipping prod over.

DickHertz ago

Do you run sql server with snapshot isolation turned on? I know JIRA requires it for a SQL Server install and MVCC is the only way Oracle and Postres run. I would think it would scale better with having to do dirty reads with no lock hints etc.

PuttItOut ago

We do a lot of dirty reads right now but that is because we have major contention points on high transaction tables. Voat needs to be denormalized in a few places to overcome this and since this brings into the equation large rewrites to facilitate scaling, it will require some dedication.

Voat and sites like it are very challenging because of the ratios of writing and the dependence on sorted data based specifically on these high transaction columns. While Voat was designed according to how you would normally design a proper database, the volumes and interdependence of the data requires an entirely different strategy to achieve proper scaling.

What is remarkable is what we have been able to do so far without rewriting entire areas. Today Voat's volume is consistently higher than ever in the past and we are running on smaller VMs than we ever have in the past.

DickHertz ago

Have you tested sql server with snapshot isolation turned on and using read committed snapshot isolation? I suspect that whatever you got with that would be similar to pg since readers can no longer block writers and vice versa at the cost of maintaining multiple versions of the truth at the same time.

PuttItOut ago

We nearly always use implicit transactions and act on a single source so no we do not explicitly set transaction levels nor do we have more than a few places that create an explicit transaction. We do have snapshot turned on.

You can take a look through the code and see if you see an obvious issues we could improve upon.

Here is the repo code: https://github.com/voat/voat/tree/Core/Voat/Voat.Business/Data/Repository

DickHertz ago

Looked at JIRA installation instructions and they make READ_COMMITTED_SNAPSHOT the default isolation level at the database level so implicit transactions use it without change. This is probably because their preferred database is pg so they already know there won't be problems using it. YMMV :)

ALTER DATABASE MyDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON <-- this guy turns on row versioning which you take a hit for whether your using it or not, extra 14 bytes per row, and tempdb space needed to track the versioning

ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON -- this is the option JIRA uses to make this isolation level the default for implicit transactions

Here's an older link (but much newer than 2005 when this feature was implemented) all about it: https://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/

Another newer link from horse's mouth about it: https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql/snapshot-isolation-in-sql-server

As mentioned in the first link changing the default isolation level will not undo the effect of NOLOCK "hints" (which appear to be directives in this case) so those would all have to be taken out if you wanted avoid dirty reads. So if you had a really good integration test that could catch any errors that might creep in one possibility would be to stand up another clone of the database somewhere, fork the code and delete all the NOLOCKs sprinkled about, and then alter the database with the second statement above and let the testing rip. It does require that the database be quiesced before it takes effect but if it's a new database without anyone doing anything on it then that should not be a problem.

If you get the same answers without the locking and blocking and performance is at least as good - win!

Here's a dumb question: how does somebody contribute to development without doxxing themselves?

PuttItOut ago

Here's a dumb question: how does somebody contribute to development without doxxing themselves?

Create a dummy GitHub account and use that for any interactions with the repo.

And I'll look into this, the db was set up so long ago that I can't remember how and what we did. Looks like something to try.

DickHertz ago

If I did it last week I probably wouldn't remember how and what I did :)

see - I knew it was a stupid question

also if you're not actually using it you might pickup some speed turning it completely off as well.