This is a first post about great code review feedback, that I either gave or received. It will always consist of 3 parts: context, review feedback and explanation. You can go ahead and read previous ones here: https://www.michalbialecki.com/2019/06/21/code-reviews/. So
The context
This is a simple ASP.Net application, that is requesting database to get count of elements filtered by one parameter. In this case we need a number of users providing a country code, that is always two character string.
This is how DB schema looks like:
Code in .net app is written with Dapper nuget package, that extends functionalities of IDbCommand and offers entities mapping with considerably good performance. It looks like this:
public async Task<IEnumerable<UserDto>> GetCountByCountryCode(string countryCode) { using (var connection = new SqlConnection(ConnectionString)) { return await connection.QueryAsync<UserDto>( "SELECT count(*) FROM [Users] WHERE CountryCode = @CountryCode", new { CountryCode = countryCode }).ConfigureAwait(false); } }
Looks pretty standard, right? What is wrong here then?
Review feedback
Please convert
Explanation
Notice, that CountryCode in database schema is a varchar(2) and this means that it stores two 1-byte characters. On the contrary
The correct code should look like this:
public async Task<IEnumerable<UserDto>> GetCountByCountryCodeAsAnsi(string countryCode) { using (var connection = new SqlConnection(ConnectionString)) { return await connection.QueryAsync<UserDto>( "SELECT count(*) FROM [Users] WHERE CountryCode = @CountryCode", new { CountryCode = new DbString() { Value = countryCode, IsAnsi = true, Length = 2 } }) .ConfigureAwait(false); } }
If we run SQL Server Profiler and check what requests are we doing, this is what we will get:
As you can see
In order to check how that would impact the performance, I created a SQL table with 1000000(one
Before review it took 242
All code posted here you can find on my GitHub: https://github.com/mikuam/console-app-net-core
Nice, thanks for sharing that 🙂
Small code pieces not a lot of people would think to refactor. Especially that if you sum many small things like this one, performance can be significantly harmed.
My takeaway from this is “just use nvarchar for everything”, if using varchar makes the code that complicated.
I implemented a Roslyn analyzer for the issue (and some other ones)
https://github.com/olsh/sql-analyzer-net/
Is this still required in the newest version of dapper (2.0.35) I think is newest.
If my parameter declaration is this
p.Add(“@SessionToken”, SessionToken, dbType: DbType.String, direction: ParameterDirection.Input);
would i then change it to the following
.Add(“@SessionToken”, SessionToken, dbType: DbType.AnsiString, direction: ParameterDirection.Input);
There is an easier solution, and possibly slightly quicker, and no need to specify length as it’s absolute:
BEFORE: “SELECT count(*) FROM [Users] WHERE CountryCode = @CountryCode”, new blah blah …..
AFTER: $”SELECT count(*) FROM [Users] WHERE CountryCode = ‘{CountryCode}'”).ConfigureAwait(false)…
Here we are using the string interpolation operator, and simply adding single quotes around the {CountryCode}. I went from 2000ms queries to 1ms queries with this simple change.
Is this safe? I think this will allow the use of SQL Injection.
agreed
> As you can see first query needs to convert CountryCode parameter from nvarchar(4000) to varchar(2) in order to compare it.
I think at least in some cases the situation is even worse: Since it is not possible to convert nvarchar to char without possible information loss it is actually the row data that gets converted. After all a single convert from nvarchar to char should not even register as a performance loss. But converting the char(2) to nvarchar for each row to do the comparison is a significant hit, which is probably what you are seeing here.
Also @Nicholas…
> Here we are using the string interpolation operator…
Sure, it’s faster. But this is how SQL injection security nightmares starts. I have seen it first hand. Suppose country code comes from a form post from the web. What happens when someone posts
“‘; DELETE FROM [Users]; Select ‘PAWND”
as CountryCode?
nice! thanks for sharing. we saw this today. It’s the order of datatype conversion.
sql server solves this by scanning (parts of) the index
resulting in more IO
https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-ver15
we knew that, but now we know how dev can fix it.