-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SR-10374] URLCache: init method and first time sqlite database setup implemented #2000
[SR-10374] URLCache: init method and first time sqlite database setup implemented #2000
Conversation
2ccc7c0
to
7793c62
Compare
7793c62
to
be57ae3
Compare
@karthikkeyan Apologies for the very late response on this PR. I think implementations of URLCache and URLCredentialStorage have been long pending and this looks like a great start! However, I've a few questions about this pull request:
|
I have failed to give a fair bit of a thought process about the Linux platform. Thanks for reminding of it, I will follow the discussion and see how best we can move this PR forward. |
be57ae3
to
ca9d675
Compare
@pushkarnk I have reimplemented store and retrieve logic using |
8875d8f
to
0f21c7c
Compare
@swift-ci test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full disclosure: I didn’t know about this PR, and I started working on my own URLCache
(that's the reason for #2245). I have some parts that are present in this version (with some differences), but I focused in the memory cache first. I have ideas for the disk cache, but that's more complicated. I can also have a lot more feedback about things I know that will need to be handle (for example skipping some protocols from being cached). Are you still planning in working on this?
Foundation/URLCache.swift
Outdated
|
||
sharedSyncQ.sync { | ||
// Remove previous data before resetting new URLCache instance | ||
URLCache.sharedCache?.persistence?.deleteAllCaches() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the original implementation do this? Feels weird removing “user” data (even if it is a cache), just by assigning another value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not.
Foundation/URLCache.swift
Outdated
|
||
let path: String | ||
|
||
var cacheSizeInDesk: Int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: cacheSizeInDisk
let attributes = try FileManager.default.attributesOfItem(atPath: path.appending(fileName)) | ||
total += (attributes[.size] as? Int) ?? 0 | ||
} | ||
return total |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there plan to keep this value updated as cached responses are added and evicted? Feels like with a big cache and a slow disk, each of these calls might take a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me add a FIXME
comment, I will create a separate PR for optimizing this. My main focus in this PR is to create an initial setup and kick start the implementation. Hope that's ok.
Foundation/URLCache.swift
Outdated
func setupCacheDirectory() { | ||
do { | ||
if FileManager.default.fileExists(atPath: path) { | ||
try FileManager.default.removeItem(atPath: path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn’t this delete the previous cache contents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
Foundation/URLCache.swift
Outdated
} | ||
|
||
private func urlForFileIdentifier(_ identifier: String) -> URL { | ||
return URL(fileURLWithPath: path + "/" + identifier) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to use URL(fileURLWithPath: path, isDirectory: true).appendingPathComponent(identifier)
to handle platforms with other directory separators.
Foundation/URLCache.swift
Outdated
} | ||
|
||
let method = httpMethod ?? "GET" | ||
return urlString + method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is used as part of the path, shoulnd’t the output be escaped or hashed somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
@drodriguez Sorry for the delay response, yes I will be working on this PR. I was just waiting for someone to review this. Please give me your suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will see if I submit my memory cache side, since the two are complementary (but we overlap in some points in which we already differ).
Foundation/URLCache.swift
Outdated
} else { | ||
let fourMegaByte = 4 * 1024 * 1024 | ||
let twentyMegaByte = 20 * 1024 * 1024 | ||
let bundleIdentifier = Bundle.main.bundleIdentifier ?? UUID().uuidString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That UUID
at the end is a little bit worrying. The app will fill caches, and will never reuse the cached results, filling the disk of garbage. I can think at least of ProcessInfo.processInfo.processName
as a good fallback, but it is not unique (but the bundle identifier doesn’t have to be either). Maybe if no bundle identifier is found the best thing is not to have a disk cache at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this point to the temporary directory with a fixed name? (A sanitized process name does sound good to me!)
Foundation/URLCache.swift
Outdated
let fourMegaByte = 4 * 1024 * 1024 | ||
let twentyMegaByte = 20 * 1024 * 1024 | ||
let bundleIdentifier = Bundle.main.bundleIdentifier ?? UUID().uuidString | ||
let cacheDirectoryPath = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first?.path ?? "\(NSHomeDirectory())/Library/Caches" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same idea here. If we don’t find a good URL for .cachesDirectory
maybe it is better to not have disk cache.
The fallback has the problem of being very Darwin specific. For example NSHomeDirectory()
in Android is the root directory, where one cannot write at all. Also, it might want to use appendingPathComponent(…).appendingPathComponent(…)
to be nicer to Windows.
(There's another case below where the POSIX path separator is used instead of a higher level function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The home directory should be in a persistent writable path everywhere, much as possible. I know that Android is really weird about this, but that looks like incorrect mapping to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, please do not fall back like this. If FileManager
does not return a caches URL, consider not caching at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will remove the fallback if FileManager
for cachesDirectory
returned nill.
Foundation/URLCache.swift
Outdated
self.memoryCapacity = memoryCapacity | ||
self.diskCapacity = diskCapacity | ||
|
||
if let _path = path { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might also want to check that diskCapacity
is more than zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Darwin indeed does accept 0 arguments. My assumption is that memory capacity 0 means that as much as possible no data is kept in memory, and disk capacity 0 that no data is written to disk.
Foundation/URLCache.swift
Outdated
|
||
fileprivate extension URLRequest { | ||
|
||
var cacheFileIdentifier: String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slightly different design here (I use a URLCacheKey
type, instead of String
), but one important thing is that generating a key from the URLRequest
depends on some details from the request beside the URL itself. So my equivalent method returns an optional.
The first thing is the protocol. Protocols like data:
or file:
should not be cached for obvious reasons. I also think unknown protocols should not be cached either.
Also, for known protocols like http:
or https:
(and maybe for ftp:
, but I don’t know that protocol that well), we should do a couple more things. For example:
- I don’t think you can do a HTTP request without a method, so that's a
return nil
in the case the protocol is HTTP. - I think some headers should be taken into account. The spec refers to
User-Agent
, but others likeAccept-Language
andRange
might be interesting. - I think the hash of the body contents should also be used.
- I also think one should not try to cache request with body streams, because that will consume it, which is not good.
- Finally I would hash everything and return an hexadecimal string, to have a valid file name.
Foundation/URLCache.swift
Outdated
} else { | ||
let fourMegaByte = 4 * 1024 * 1024 | ||
let twentyMegaByte = 20 * 1024 * 1024 | ||
let bundleIdentifier = Bundle.main.bundleIdentifier ?? UUID().uuidString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this point to the temporary directory with a fixed name? (A sanitized process name does sound good to me!)
Foundation/URLCache.swift
Outdated
let fourMegaByte = 4 * 1024 * 1024 | ||
let twentyMegaByte = 20 * 1024 * 1024 | ||
let bundleIdentifier = Bundle.main.bundleIdentifier ?? UUID().uuidString | ||
let cacheDirectoryPath = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first?.path ?? "\(NSHomeDirectory())/Library/Caches" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, please do not fall back like this. If FileManager
does not return a caches URL, consider not caching at all.
Foundation/URLCache.swift
Outdated
|
||
sharedSyncQ.sync { | ||
// Remove previous data before resetting new URLCache instance | ||
URLCache.sharedCache?.persistence?.deleteAllCaches() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not.
Foundation/URLCache.swift
Outdated
self.memoryCapacity = memoryCapacity | ||
self.diskCapacity = diskCapacity | ||
|
||
if let _path = path { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Darwin indeed does accept 0 arguments. My assumption is that memory capacity 0 means that as much as possible no data is kept in memory, and disk capacity 0 that no data is written to disk.
Foundation/URLCache.swift
Outdated
func setupCacheDirectory() { | ||
do { | ||
if FileManager.default.fileExists(atPath: path) { | ||
try FileManager.default.removeItem(atPath: path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
Foundation/URLCache.swift
Outdated
} | ||
|
||
let method = httpMethod ?? "GET" | ||
return urlString + method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
I want to shepherd this patch into the repository. Do you need help with changes here? |
@millenomi I am addressing all the comments added by you and Danial. I will let you know if I need any help. Thanks Lily. |
e1aa5c5
to
cd746f3
Compare
@millenomi @drodriguez I have addressed all the comments. I still have some doubts about the implementation of the property |
@swift-ci please test |
Foundation/URLCache.swift
Outdated
return nil | ||
} | ||
|
||
var hashString = "\(scheme)_\(method)_\(urlString)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The collision space of the default hash is smaller than you probably think. Can we at least check when we load a request that this is actually the request we were looking for?
Foundation/URLCache.swift
Outdated
|
||
let url = urlForFileIdentifier(fileIdentifier) | ||
guard let data = try? Data(contentsOf: url), | ||
let response = NSKeyedUnarchiver.unarchiveObject(with: data) as? CachedURLResponse else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use secure coding here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To use secure coding, just invoke archivedData(withRootObject:requiringSecureCoding:)
and unarchivedObject(of:from:)
instead. You should've gotten deprecation warnings for these calls.
Foundation/URLCache.swift
Outdated
} | ||
|
||
func saveCachedResponse(_ response: CachedURLResponse, for request: URLRequest) { | ||
let archivedData = NSKeyedArchiver.archivedData(withRootObject: response) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly less urgent, but I'd like you to use secure coding wherever you use coding.
Foundation/URLCache.swift
Outdated
do { | ||
try FileManager.default.createDirectory(atPath: path, withIntermediateDirectories: true) | ||
} catch { | ||
fatalError("Unable to create directories for URLCache: \(error.localizedDescription)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not introduce fatal errors for things the user may not be able to control. As I asked elsewhere: consider not loading if we do not have a suitable directory.
I'm going to commandeer this patch; I may drop the SQLite dependency, though. |
@millenomi I have pushed one more commit addressing the comments related to secure archiving. Also, I have updated the hashing mechanism for generating The one thing I am struggling though is to write a unit test for the internal property |
* init method implemented * URLCache.shared singleton object created with 4MB of memory space and 20MB of disk space * Directory and database file Cache.db file created under local directory * Sqlite Tables and Indices created in database * Unit tests added for URLCache to verify directory, file, tables and indices
…d NSKeyedUnarchiver
…eIdentifier hash mechanism updated.
d8e1009
to
b5dafd2
Compare
Closing this PR as |
init
method implementedURLCache.shared
singleton object created with 4MB of memory space and 20MB of disk spaceURLCache
to verify directory, file, tables, and indicesBug Link: https://bugs.swift.org/browse/SR-10374